ABSTRACT



This report describes the development and application 
of a program to forecast important air/ocean parameters using 
the method(s) of model output statistics. The focus of this
operationally oriented study is to forecast atmospheric
marine horizontal visibility using a discrete analysis of
observed visibility and the Navy's Operational Global Atmospheric
Prediction System (NOGAPS) model output parameters.

Three strategies (two based on maximum-probability and one
based on natural-regression) are compared to two multiple
linear regression methods. The primary data set is from a 
North Atlantic Ocean area bounded approximately by the North 
American coast from Norfolk, Va. to St. Johns, Newfoundland, 
and then eastward to about 37.5°W. Both the dependent and 
independent data were derived from the same basic set. New 
or unfamiliar concepts, in addition to the primary methodology, 
include the statistical division of the North Atlantic Ocean 
into physically homogeneous areas, two new threshold models 
for the application of linear regression equations, linear 
regression based upon a 'decision-tree' concept, functional 
dependence of predictors and class errors. Results show
that the methodology proposed by Preisendorfer outperforms
multiple linear regression.
I. INTRODUCTION AND BACKGROUND



Model output statistics (MOS) is a technique whereby 
parameters output from numerical weather prediction models 
(predictors) are statistically processed, with observed 
data, to produce forecasts of one of the following categories
of parameters (as predictands):

a. operationally important parameters not output by the
numerical prediction model (e.g., visibility, cloud
cover, ceiling).

b. model output parameters whose predictive skill is 
improved (e.g., surface wind, temperature) due to 
correction of numerical model bias and/or scale. 

Historically, the methodology has consisted of generating 
empirical equations by a linear, least-squares regression 
model. This technique is used by both the National Weather 
Service and the United States Air Force Air Weather Service 
and has demonstrated operationally usable skill in forecasting
numerous weather elements at locations over land
throughout the world [Best and Pryor, 1983]. Attempts by 
the United States Navy to forecast open-ocean fog and visi- 
bility using linear regression equations have shown skills 
of marginal operational usefulness but exceeding those of 
persistence and climatology [Aldinger, 1979; Yavorsky, 1980; 
Selsor, 1980; Koziara et al., 1983; Renard and Thompson, 1984].
Presumably, this level of performance is due, in part, to
the lack of 'calibrated' fog and visibility observations. 
Shipboard weather observers lack sufficient reference points 
to be able to accurately estimate the range of atmospheric 
visibility. 

In the spring of 1983, the United States Navy made the 
decision to begin development of a MOS program to forecast 
operational air/ocean parameters over the oceans of the 
world. Primarily because of the importance of horizontal
visibility to the mariner, this parameter was selected as
the initial candidate. However, because of less-than-perfect
prior results using linear regression in the North Pacific
Ocean, it was decided to investigate other methodologies
to determine if a better one could be found.

This study presents statistical methodologies proposed by 
Preisendorfer (1983 a,b,c). Specifically, three strategies, 
two based on maximum-probability and one based on
natural-regression, are further developed, tested and applied to sets
of model output parameters from both the North Pacific and
North Atlantic Ocean areas. In addition, multiple linear
regression is applied to the same data. Innovative threshold
techniques, developed by Lowe (1984a), are also applied, and
methodologies are compared.

In the following discussion, a sufficient number of terms 
and symbols are defined to allow readers without strong 
statistical backgrounds to understand the results. However, 
for a proper understanding of the Preisendorfer (1983 a,b,c)
methodology, readers are encouraged to read Appendix A,
which contains a detailed discussion. Similarly, details on
the linear regression model and threshold procedures [Lowe,
1984a] are to be found in Appendix B.
II. OBJECTIVE AND APPROACH



The objective of this study is to determine if a statistical
methodology, applied to discrete values of model
output and derived parameters, can improve upon the fore- 
casting of horizontal marine atmospheric visibility when 
compared to linear regression. The approach is as follows: 

a. define categorical groupings of visibility which 
relate to operational use at sea. 

b. develop and apply the Preisendorfer (1983 a,b,c) 
methodology using July 1979 North Pacific Ocean data. 

c. apply the methodology developed in b. above to June 
1983 North Atlantic Ocean data. 

d. compare Preisendorfer (1983 a,b,c) results to those 
of the Lowe (1984a) linear regression approach for 
the North Pacific and North Atlantic Ocean data sets.
III. DATA



A. VISIBILITY OBSERVATIONS AND SYNOPTIC CODE 

Visibility observations at sea are reported as one of 
ten synoptic codes, ranging from 90 (visibility less than 
50 m) to 99 (visibility equal to or greater than 10 km).
However, because marine visibility is observed and recorded
inexactly in category form, and its interpretation is further
degraded by users in forecasting, a simplified categorization
of visibility was developed as follows:

category    synoptic code    visibility range

I           90-94            < 2 km
II          95-96            ≥ 2 km and < 10 km
III         97-99            ≥ 10 km



This scheme is based upon the following operational
criteria, which apply when observed visibility falls below
the indicated value:

1. 10 km (5 n mi) — United States Navy aircraft carrier 
flight recovery operations change from visual to con- 
trolled approach [Department of the Navy, 1979]. 

2. 2 km (1 n mi) — sounding of reduced visibility signals 
for all vessels operating in international waters. 

(The term 'reduced visibility' is not defined in the
International Regulations for Preventing Collisions at
Sea, 1972. However, United States Navy Captains and
Merchant Marine Masters generally consider it to be
1 n mi.)
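
The grouping of synoptic codes into the three visibility categories can be sketched as a small lookup. This is an illustrative sketch only; the function name `visibility_category` and the use of the integers 1, 2 and 3 for categories I, II and III are assumptions made here, not part of the original programs.

```python
def visibility_category(synoptic_code: int) -> int:
    """Map a synoptic visibility code (90-99) to category 1, 2 or 3 (I, II, III)."""
    if not 90 <= synoptic_code <= 99:
        raise ValueError(f"not a synoptic visibility code: {synoptic_code}")
    if synoptic_code <= 94:      # codes 90-94: category I
        return 1
    if synoptic_code <= 96:      # codes 95-96: category II
        return 2
    return 3                     # codes 97-99: category III
```

For example, `visibility_category(95)` returns 2, the category bounded by the two operational thresholds.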

B. NORTH PACIFIC OCEAN DATA 

The data from the North Pacific Ocean are described by 
Selsor (1980) and Koziara et al. (1983). Only the July 1979
model initialization (TAUOO) data are used, consisting of 19
model output parameters (MOP) from the Northern Hemisphere
models operational in 1979, namely, the Mass Structure Analysis,
the Primitive Equation and the Marine Wind Models; and
one climatological visibility parameter from the National
Oceanic and Atmospheric Administration's National Climatic
Data Center (NCDC), Asheville, North Carolina. Two additional
parameters were derived from this set. A description of the 
parameters is found in Appendix C. 

C. NORTH ATLANTIC OCEAN DATA 
1. Area

The North Atlantic Ocean, from 0° to 80°N, was
divided into physically homogeneous areas by Lowe (1984b)
using an appropriate cluster analysis technique. The primary
area used in this study is identified as area 3W on Fig. 1,
which illustrates the North Atlantic Ocean homogeneous areas.
This area was chosen because of the relatively frequent
occurrence of poor visibility as compared to the other areas.
A summary of visibility frequencies, for each homogeneous
area and three visibility categories, is contained in Table I.

2. Time Period

Data from 15 May 1983 through 15 July 1983 were 
combined to form the June 1983 data set, hereafter referred 
to as FATJUNE. FATJUNE was chosen as the initial data set 
because of the high frequency of occurrence of poor visi- 
bility during this period. In order to maximize the credi- 
bility of visibility observations, 1200 GMT synoptic ship 
report data were used exclusively since this time corresponds 
to daylight over the entire area of study during FATJUNE. 

Model output parameter data (predictors) at 1200 GMT
model output time, hereafter referred to as TAUOO, were used
in the development of the Preisendorfer (1983 a,b,c) methodology,
time not being available to pursue the scheme beyond that
stage. Thus, TAUOO represents model initialization time.
However, the term 'forecast' will be used throughout this
study to represent the estimate of visibility at this
initialization time.

3. Synoptic Weather Reports

All synoptic visibility observations (predictand 
data) for this study were quality-control checked and pro- 
vided by the Naval Oceanography Command Detachment (NOCD) 
co-located with the NCDC. Those furnished observations which 
contain systematic observer error or are suspect or obviously 
erroneous, as determined from the data quality indicators, 
are not incorporated in the final data set. 
4. Predictor Parameters



Fifty TAUOO model output parameters (MOP's) (predictor
data) were provided for the period of study by the Fleet
Numerical Oceanography Center (FNOC), Monterey, California.
These parameters are from their current operational prediction 
model, the Navy Operational Global Atmospheric Prediction 
System (NOGAPS). All MOP's were interpolated from model grid 
coordinates to synoptic ship observation positions using a 
linear interpolation scheme. Of the 50 parameters provided, 
only 35 were used in the development of the Preisendorfer 
(1983 a,b,c) and Lowe (1984a) methodologies, the remainder 
being considered as either having little likelihood of 
importance in the forecasting of visibility or not usable 
due to the lack of significant digits (which were lost during
the transfer from FNOC tapes to the main computer center's
mass storage data system). Twelve additional parameters were
derived from the interpolated MOP's. Seven of these are
equations derived from a linear regression model which will 
be described in Chapter V and Appendix B. Each equation 
represents an estimate of the visibility category, which is 
used as a predictor. A list of all of the predictor param- 
eters is provided in Appendix D. 

D. DEPENDENT/INDEPENDENT DATA SETS 

Due to the limited amount of data available to this 
study for each of the North Atlantic Ocean homogeneous 
areas, it was necessary to withhold one-third of the 
observations from the developmental model to use as an independent
data set. This was accomplished by the use of a
counter and transfer statement in the computer programs which 
prevented every third observation from entering the develop- 
mental computations. To ensure that the dependent and inde- 
pendent data were representative of the same population, a 
95% confidence interval for proportions [Miller and Freund, 
1977] was established from the entire data set, for each 
visibility category, and the dependent and independent data 
sets were constrained to have visibility frequencies within 
these established confidence intervals. This same procedure 
was applied to the North Pacific Ocean data for consistency of 
method. Table II summarizes the dependent and independent
data for both the North Atlantic Ocean and North Pacific
Ocean data sets.
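
The withholding of every third observation and the representativeness check described above can be sketched as follows. This is a minimal sketch under stated assumptions: the interval used is the usual large-sample normal approximation for a proportion (z = 1.96), which may differ in detail from the form taken from Miller and Freund (1977), and all function names are hypothetical.

```python
import math

def split_every_third(observations):
    """Withhold every third observation as independent data (counter-based)."""
    dependent, independent = [], []
    for counter, obs in enumerate(observations, start=1):
        (independent if counter % 3 == 0 else dependent).append(obs)
    return dependent, independent

def proportion_interval(count, n, z=1.96):
    """Assumed large-sample 95% confidence interval for a proportion."""
    p = count / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width

def is_representative(subset, full, categories=(1, 2, 3)):
    """True if every category frequency in `subset` lies inside the
    confidence interval established from the entire data set `full`."""
    for c in categories:
        low, high = proportion_interval(full.count(c), len(full))
        frequency = subset.count(c) / len(subset)
        if not low <= frequency <= high:
            return False
    return True
```

Here `subset` and `full` are lists of observed visibility categories; the check would be applied to both the dependent and the independent subsets.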
IV. PRELIMINARY EXPERIMENTS



A. TERMS AND SYMBOLS 

The terms and statistical symbols defined below will be 
used throughout the remainder of this report. The formal
mathematical definitions can be found in Appendices A and E.

1. Maximum-probability strategy -- choosing forecast
visibility categories based upon the highest conditional
probabilities of visibility within a predictor interval.

2. MAXPROB1 -- designation of the maximum-probability
strategy in which ties of the highest conditional
probabilities in a predictor interval are resolved by
the generation of a random number.

3. MAXPROB2 -- designation of the maximum-probability
strategy in which ties of the highest conditional
probabilities in a predictor interval are resolved by
assigning the lowest visibility category, of those
tied, as the forecast category.

4. Natural-regression strategy -- choosing forecast visibility
categories based upon the statistical average
of the conditional probabilities of visibility within
a predictor interval.

5. a0 -- the probability of a zero-class visibility category
forecast error (e.g., visibility category I is forecast
and is also observed).

6. a1 -- the probability of a one-class visibility category
forecast error (e.g., visibility category I is
forecast and category II is observed).

7. a2 -- the probability of a two-class visibility category
forecast error (e.g., visibility category I is
forecast and category III is observed).

8. CE -- class error parameter, defined as a1 + 2a2, used to
identify the first predictor.

9. PP -- the potential predictability of visibility by
any given predictor.

10. FD -- the functional dependence of one predictor on
another. This is a measure of functional dependence
of a statistical kind and not of the deterministic
kind. The term 'functional dependence' is used by
Preisendorfer (1983c) and, being sufficiently descriptive
of the concept, it will be used herein.

11. RSS FD -- root sum squared FD. The functional dependence
of a predictor on all predictors already included in
the developmental model. It is equal to the square
root of the sum of the squares of the individual FD's.

12. TS1 -- threat score for visibility category I computed
from a contingency table.

13. ATS1 -- adjusted threat score for visibility category
I, which removes the influence of the data set category
frequency.

14. AAO -- adjusted a0. A contingency table statistic
which removes the influence of the most frequent visibility
category in a set of data (similar to a normalized
value).

15. EPI -- equally populous predictor interval used to
discretize the predictors.
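
To make the discretization and the maximum-probability idea concrete, the following sketch divides one predictor into equally populous intervals by rank and builds a forecast table in which ties go to the lowest visibility category (the second maximum-probability strategy). It is an illustration only: the function names are hypothetical, the rank-based handling of tied predictor values is an assumption, and the formal definitions remain those of Appendix A.

```python
from collections import Counter

def epi_indices(values, k):
    """Assign each value to one of k equally populous intervals (0..k-1) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    indices = [0] * len(values)
    for rank, i in enumerate(order):
        indices[i] = rank * k // len(values)
    return indices

def maxprob_table(intervals, categories):
    """For each interval, forecast the category with the highest conditional
    frequency; ties are resolved toward the lowest category number."""
    counts = {}
    for j, c in zip(intervals, categories):
        counts.setdefault(j, Counter())[c] += 1
    table = {}
    for j, counter in counts.items():
        best = max(counter.values())
        table[j] = min(c for c, n in counter.items() if n == best)
    return table
```

With predictor values `[5, 1, 3, 2, 6, 4]`, three intervals and observed categories `[3, 1, 2, 1, 3, 1]`, the middle interval holds one category I and one category II observation, so the tie rule forecasts category I there.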

B. COMPUTER PROGRAMS

Four computer programs were developed to test the 
proposed Preisendorfer (1983 a,b,c) methodology. The 
programs are on file in the Department of Meteorology, Naval 
Postgraduate School, Monterey, California, 93943. 

1. A program to compute a0, a1, CE and PP for all predictors,
all strategies (MAXPROB1, MAXPROB2 and Natural-Regression)
and a single number of EPI's. Statistics
for the three strategies are based upon the same predictor(s)
rather than the best predictor(s) for each
strategy. It was determined during program development,
and will be shown in Chapter VI, that, in general, each
of the strategies chose the same predictor(s).

2. A program to compute FD for all predictors, on a given
predictor, for a given number of EPI's, and to compute
the upper 5% critical value (FD(96)) by Monte-Carlo
means (Appendix A).

3. A program to construct contingency tables and to compute
skill and threat scores, for both the dependent
and independent data.

4. A program to generate 100 random data sets, from the
marginal probabilities of the predictor(s) in the
developmental model, and to compute upper and lower
5% critical values for a0 and a1 to be used for testing
the significance of the results from the Preisendorfer
(1983 a,b) methodology against chance.
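
The class-error statistics computed by the first program can be illustrated with a short sketch. It assumes categories are coded 1, 2 and 3, and it takes the class error parameter to be the one-class error probability plus twice the two-class error probability, as defined in the list of terms; the function name is hypothetical and the original programs are not reproduced here.

```python
def class_error_stats(forecast, observed):
    """Return (a0, a1, a2, CE) for paired forecast/observed categories (1, 2, 3)."""
    n = len(forecast)
    counts = [0, 0, 0]                      # zero-, one- and two-class errors
    for f, o in zip(forecast, observed):
        counts[abs(f - o)] += 1
    a0, a1, a2 = (c / n for c in counts)
    ce = a1 + 2 * a2                        # class error parameter
    return a0, a1, a2, ce
```

For forecasts `[1, 1, 2, 3]` against observations `[1, 2, 2, 1]`, two forecasts are exact, one is off by one class and one by two classes, giving a0 = 0.5, a1 = 0.25, a2 = 0.25 and CE = 0.75.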

C. BEHAVIOR OF a0 AND THREAT SCORES

Before attempting a formal application of the Preisendorfer
(1983 a,b,c) methodology, it was considered prudent
to investigate the behavior of certain statistics as the
number of equally populous predictor intervals was changed
and as new predictors were added. It was found, during
program testing and before a formal procedure had been established,
that the independent data threat score of visibility
category I (TS1) generally showed higher values than other
threat scores (TS2, TS12) for the independent data. Therefore,
it was decided that the dependent and independent data
a0 and TS1 scores would be compared. The statistic a0 was
chosen because it is the single most important scoring
parameter in the Preisendorfer methodology.

The experiment consisted of choosing the first predictor
as that one which gave the highest a0 value when divided
into ten equally populous intervals. Once this predictor
was chosen, dependent and independent data a0 and TS1 scores
were computed for each number of intervals as the number was
varied from two to 100. Prior to proceeding to the next
step, the number of intervals which gave the highest independent
data TS1 score was identified and the first predictor
was held at this number of intervals for the remainder of
the experiment.

Subsequent predictors were chosen by both a maximum a0
test and a functional dependence test. As each subsequent
predictor was identified, its number of equally populous
intervals was varied from two to 50 (or fewer, as the maximum
array size was set at 120,000). The number of equally populous
intervals giving the highest independent data TS1 was
identified and held fixed for the following stage. This procedure
was repeated until either six predictors were used or
until a new predictor addition did not allow the comparison
of at least intervals two through ten, due to computer
storage limitations. It should be noted here that all of
the North Atlantic Ocean parameters, not including
linear-regression equations, were used in these experiments and,
subsequently, some parameters were removed from consideration
(Appendix D).

1. Maximum a0 Method

The first NOGAPS predictor selected was SMF, which
was varied from two to 100 EPI's (Fig. 2a); the highest
TS1 score was obtained with six intervals. The second predictor
chosen, when SMF was held at six intervals and all
others at ten, was DTDP, which produced the highest a0 value
for two predictors. Holding SMF at six intervals, DTDP was
varied from two to 50 intervals (Fig. 2b) and the highest
TS1 score was obtained at 20 intervals. Anticipating problems
with the subsequent array size with respect to the number of
predictors which could be included, the secondary TS1 maximum
at 16 intervals was used for further stepping. The third and
subsequent predictors and their optimum interval sizes were
PS at 12 (Fig. 2c), UBLW at ten (Fig. 2d) and V400 (Fig. 2e).
The optimum number of intervals for V400 was not germane as
no further stepping was done after this step. As illustrated
in Fig. 2, the dependent data statistics asymptotically approach
unity as predictors are added, while the independent data
statistics (approximate maximum values: a0 = .70, TS1 = .35)
show no further increase after the third predictor is included,
which may imply a limit as to how well the methodology performs
on this particular data set.

2. Functional Dependence Method

As functional dependence is not considered until after
the selection of the first NOGAPS predictor, Fig. 2a is also
applicable to this method. Subsequent predictors were chosen
as those having the lowest RSS FD using ten equally populous
intervals. The predictors selected and their optimum interval
sizes, for the TS1 score, were RH at three (Fig. 3a),
DUDP at four (Fig. 3b), VOR925 at two (Fig. 3c), ENTRN at
14 (Fig. 3d) and UBLW (Fig. 3e), which was the last predictor
considered. As seen for the maximum a0 method, the dependent
data statistics asymptotically approach unity. However, the
independent data statistics continue to grow at least through
the addition of the sixth predictor (approximate maximum
values: a0 = .71, TS1 = .38). This method gave better results
than the maximum a0 method, though it, too, may imply a
limit. The results of this experiment also tend to show a
preferential selection of a small number of EPI's, for best
independent data TS1 score, as well as indicating that functional
dependence is a relatively good choice as a deciding
factor for choosing predictors.

D. BEHAVIOR OF FUNCTIONAL DEPENDENCE 

Another statistic investigated prior to the formal 
application of the Preisendorfer (1983 a,b,c) methodology
was the distribution of functional dependence (FD) calculated 
from 100 randomly generated data sets. The FD calculation is 
based upon the relationship of the distribution of one pre- 
dictor to another. Because the predictors are divided into 
the same number of EPI's for the calculation, the probability 
of a randomly generated number falling into any given inter- 
val for either predictor will be the same. Therefore, the 
randomly generated FD values should be a function only of 
the number of intervals and the number of data cases (subse- 
quent randomly generated calculations, during the formal 
application of the methodology, showed this to be true).

The randomly generated FD experiment consisted of com- 
puting the mean, upper and lower 5% critical values, and the 
standard deviation of the 100 randomly generated values for 
both 1526 observations (as in the North Atlantic Ocean Area 
3W dependent data) and 3682 observations (as in the North
Pacific Ocean dependent data) and a comparison of the 
results. As illustrated in Fig. 4 the FD values are similar 
for a given interval size differing only in the size of the 
confidence interval and the standard deviation. The FD 
values calculated for 3682 observations lie totally within 
the upper and lower 5% critical values for 1526 observations. 
Because of this relationship, future FD(96) values, used to 
qualitatively determine how well a new predictor will con- 
tribute to the developmental model, can be obtained by read- 
ing from the graph rather than using valuable computer 
resources, provided the number of equally populous intervals
is less than or equal to ten.
V. PROCEDURES



A. PREISENDORFER METHODOLOGY 

1. Determination of the First Predictor in Relation
to the Number of Predictor Intervals

A matter not considered in Preisendorfer (1983 a,b,c)
is how to choose an optimum number of equally populous predictor
intervals (EPI's) into which predictor data should
be divided. During the course of development, two important 
realizations became evident, namely, (a) there is a tendency 
for the methodology to give better results using a small 
number of intervals, and (b) the NPS W.R. Church Computer 
Center limits internal computer storage space to two mega- 
bytes for routine programs. The first suggested, while the 
second forced, the research to be limited to EPI's of less 
than or equal to ten if more than three or four predictors 
were to be considered. Once this was established, a procedure
was developed to look at all EPI's within the stated
limit.

The procedure involves computing the initial statistics
(a0, a1, CE and PP) for each predictor, for each strategy
(maximum-probability and natural-regression) and for EPI's
of two through ten. Then, the best first predictor for each
number of EPI's is determined, for each strategy, by meeting
one or both of the following conditions, when considered in
the indicated order:
a. lowest CE

b. highest PP
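
One way to read the two ordered conditions is as a lexicographic rule: the lowest CE wins, and the highest PP decides among predictors tied on CE. The sketch below encodes that reading; it is an assumption about how the ordering is applied, and the function name and the numerical values in the usage note are hypothetical.

```python
def best_first_predictor(stats):
    """stats maps predictor name -> (CE, PP).
    Lowest CE wins; ties on CE are broken by the highest PP."""
    return min(stats, key=lambda name: (stats[name][0], -stats[name][1]))
```

For instance, with made-up values `{"EHF": (0.40, 0.30), "RH": (0.40, 0.25), "PS": (0.45, 0.50)}`, EHF and RH tie on CE and EHF is chosen on its higher PP.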

Once the best predictor for each number of EPI's is
known, it is then necessary to determine the optimum number
of EPI's. This is accomplished by computing threat and skill
scores (Appendix E) for both the dependent and independent
data and choosing, as the optimum number of EPI's, that which
gives both a relatively high adjusted a0 (AAO) for the dependent
data and a relatively high adjusted threat score for
visibility category I (ATS1) for the independent data. This
becomes a somewhat subjective endeavor and remains as the
only imprecise step in the methodology.

The statistic ATS1 is used on the independent data,
instead of a0, because it is the poor visibility categories
(I and II) that are of primary forecast interest and their
forecastability is manifested in their threat scores. It
will be shown that, in general, the adjusted threat scores
for visibility category II (ATS2) and for combined visibility
categories I and II (ATS12) are small compared to ATS1, or
negative, and that ATS12 is maximized when ATS1 is maximized.
Additionally, it will be shown that maximum a0 does not
necessarily coincide with maximum ATS1 in the independent
data. Hence, if a0 were used, the optimum combination of
predictors necessary to forecast the poor visibility categories
would not be included.

Once the number of EPI's is established, it is fixed
for all subsequent predictors considered for the developmental
model. Holding the number of intervals fixed is not an
absolute necessity; however, it allows for a much more rapid
development of the model. Once this number is determined for
the first predictor, it is used to calculate FD for the next
predictor because FD is calculated using the established
number of EPI's. The next stage statistics (a0, a1, CE and
PP) are also computed with each predictor divided into this
same number of EPI's.

2. Choosing the Second Predictor

The second predictor to be included in the model is
determined from its FD on the first predictor and from the
increase in a0 resulting from its inclusion. This is accomplished
by computing a0 with two predictors, namely, the
first predictor, as determined above, with each of the
remaining predictors. Those predictors which do not increase
a0 above its value as determined with the first predictor
alone are removed from further consideration for inclusion
into the set of predictors in the developmental model. FD
for each of the remaining predictors vs. the first predictor
is computed. The remaining predictor with the lowest FD,
on the first predictor, is chosen as the second predictor in
the model.

3. Choosing Subsequent Predictors

Subsequent predictor determination is similar to the
second predictor determination. Compute a0 with N predictors
(N = 1,...,M+1; M = the number of predictors already in the
developmental model), that is, the first through Mth predictors,
as previously determined, and each of the remaining
predictors. Those predictors which do not increase a0 above
its value as determined with M predictors are removed from
further consideration. RSS FD is computed for each of the
remaining predictors and the one with the lowest RSS FD is
chosen as the Nth predictor in the model.
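
The selection step just described can be sketched as a filter followed by a minimum search. The sketch assumes that the zero-class probability for each candidate (with the candidate added to the model) and the individual FD values against each already-selected predictor have been computed elsewhere; the FD calculation itself (Appendix A) is not reproduced, and all names and numbers below are hypothetical.

```python
import math

def rss_fd(fd_values):
    """Root sum squared FD: square root of the sum of squared individual FD's."""
    return math.sqrt(sum(fd * fd for fd in fd_values))

def choose_next_predictor(current_a0, candidate_a0, fd_to_selected):
    """current_a0: zero-class probability with the M predictors already selected.
    candidate_a0: name -> zero-class probability with that candidate added.
    fd_to_selected: name -> list of FD values against each selected predictor.
    Returns the improving candidate with the lowest RSS FD, or None."""
    improving = [name for name, a0 in candidate_a0.items() if a0 > current_a0]
    if not improving:
        return None
    return min(improving, key=lambda name: rss_fd(fd_to_selected[name]))
```

With only one predictor in the model, each FD list has a single entry and the rule reduces to the lowest-FD choice used for the second predictor.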

4. Significance Tests

After each stage (i.e., after each new predictor to
be included in the developmental model is determined) it is
necessary to determine if the results are significant. This
is accomplished by Monte-Carlo means using the data set
marginal probabilities of the predictors and assuming equal
probability of occurrence for visibility categories (Appendix A).
The statistics a0 and a1 are computed for each of
100 randomly generated data sets of a size equal to the
number of observations in the dependent data set being tested,
and sorted from lowest to highest. The 96th value of a0
(a0(96)) and the fifth value of a1 (a1(05)) are retained as
the upper and lower 5% critical values. For developmental
model results to be significantly better than chance, a0
must be greater than or equal to a0(96) and a1 must be less
than or equal to a1(05).
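
A simplified version of this Monte-Carlo test, for a single predictor, might look like the sketch below. Several assumptions are made and should be read as such: the predictor's marginal probabilities are taken as uniform over the equally populous intervals, the maximum-probability forecast with ties resolved toward the lowest category is used, and the function names and default seed are hypothetical. The procedure in Appendix A may differ in detail.

```python
import random
from collections import Counter

def trial_a0_a1(n_obs, n_intervals, rng):
    """One random data set: uniform predictor intervals, equally likely categories."""
    intervals = [rng.randrange(n_intervals) for _ in range(n_obs)]
    observed = [rng.randint(1, 3) for _ in range(n_obs)]
    counts = {}
    for j, c in zip(intervals, observed):
        counts.setdefault(j, Counter())[c] += 1
    # Maximum-probability forecast per interval; ties go to the lowest category.
    table = {j: min(c for c in cnt if cnt[c] == max(cnt.values()))
             for j, cnt in counts.items()}
    errors = Counter(abs(table[j] - c) for j, c in zip(intervals, observed))
    return errors[0] / n_obs, errors[1] / n_obs

def critical_values(n_obs, n_intervals, n_sets=100, seed=0):
    """Return the 96th sorted zero-class value and the 5th sorted one-class value."""
    rng = random.Random(seed)
    results = [trial_a0_a1(n_obs, n_intervals, rng) for _ in range(n_sets)]
    a0_sorted = sorted(r[0] for r in results)
    a1_sorted = sorted(r[1] for r in results)
    return a0_sorted[95], a1_sorted[4]      # 1-based positions 96 and 5

def is_significant(a0, a1, a0_96, a1_05):
    """Significantly better than chance when both conditions hold."""
    return a0 >= a0_96 and a1 <= a1_05
```

A call such as `critical_values(1526, 5)` would mimic the size of the North Atlantic Ocean Area 3W dependent data with five intervals; extending the sketch to several predictors would require drawing joint interval cells rather than a single index.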

5. Terminating the Selection of Predictors

Model development continues until any one of four
conditions is met:
a. no more predictors remain to be considered.

b. results are no longer significant.

c. required computer region size exceeds that which is
allowed (two megabytes at the NPS W.R. Church Computer
Center).

d. independent data ATS1 does not increase for two
consecutive predictor additions. (It will be shown
that there is a point in the development of the model
where the skill and threat scores for the dependent
data diverge sharply from those for the independent
data. This condition for terminating model development
is a subjective attempt at taking this point into
consideration.)

Once the model development is complete, contingency
tables of forecast visibility categories vs. observed visibility
categories, for both the dependent and independent
data, are constructed. From the contingency tables, threat
and skill scores for both data sets are computed and compared.
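
As an illustration of this step, the sketch below builds a forecast-versus-observed contingency table and computes an unadjusted threat score for one category using the commonly used definition (correct forecasts divided by the sum of forecasts and observations of that category less the correct forecasts). The formal definitions used in this study, including the adjusted scores, are those of Appendix E and are not reproduced; the function names are hypothetical.

```python
def contingency_table(forecast, observed, categories=(1, 2, 3)):
    """table[f][o] = number of cases with forecast category f and observed category o."""
    table = {f: {o: 0 for o in categories} for f in categories}
    for f, o in zip(forecast, observed):
        table[f][o] += 1
    return table

def threat_score(table, category):
    """Hits / (forecasts + observations - hits) for the given category."""
    hits = table[category][category]
    forecast_total = sum(table[category].values())
    observed_total = sum(row[category] for row in table.values())
    denominator = forecast_total + observed_total - hits
    return hits / denominator if denominator else 0.0
```

For forecasts `[1, 1, 2, 3, 1]` and observations `[1, 2, 2, 1, 1]`, category I is forecast three times, observed three times and hit twice, so its threat score is 2 / (3 + 3 - 2) = 0.5.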

B. COMPARISON METHODOLOGY 

The results obtained from the Preisendorfer (1983 a,b,c) 
methodology were compared to two variations of a linear, 
least-squares regression model. The model chosen for the 
comparison is that available in the BMDP Statistical Software 
(namely BMDP2R) [University of California, 1981] using two 
new threshold schemes developed by Lowe (1984c) (Appendix B).
The equations developed by BMDP2R include all predictors which
increased R-squared (the proportion of the predictand variance
explained by the estimation of the predictand from the
multiple regression equation) by at least 1%. An excellent
description of this procedure is given by Best and Pryor
(1983), with R-squared being equivalent to their R-value.

1. Method 1 

The first linear regression method consists of 
generating a single equation, trained on the dependent data, 
with the predictand set equal to 1, 2 or 3, corresponding to 
visibility categories I, II and III, respectively. This 
equation is used to determine threshold values (Appendix B)
and is then applied to the independent data. 

2. Method 2

The second linear regression method is based on a 
decision-tree scheme using two linear-regression equations 
trained on the dependent data. The first equation is 
generated with the predictand values set equal to zero or 
one, corresponding to combined visibility categories I and 
II (0) and visibility category III (1). The second equation 
is generated with the predictand set equal to zero or one, 
corresponding to visibility category I (0) and visibility 
category II (1). Visibility category III observations are 
ignored during this linear regression. Threshold values are 
then computed for each equation. 

When both equations and their associated threshold 
values are known, the independent data set is sorted into 
visibility category III and visibility category 'other' by
the first equation, and the 'other' category is sorted into 
visibility categories I and II by the second equation. 
Following the development of linear regression method 1 and 
method 2, contingency tables are constructed, skill and 
threat scores computed, and comparisons made with the results 
from the Preisendorfer (1983 a,b,c) methodology. 
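
The decision-tree sorting of method 2 can be sketched as two successive threshold comparisons. The sketch assumes one threshold per equation and that a regression estimate at or above its threshold selects the class coded as one; the actual threshold schemes are those of Appendix B and are not reproduced here. The function names, coefficients and thresholds are hypothetical.

```python
def linear_estimate(coefficients, intercept, predictors):
    """Evaluate a linear regression equation for one observation."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))

def decision_tree_category(predictors, eq1, t1, eq2, t2):
    """eq1 separates category III (coded 1) from combined I and II (coded 0);
    eq2 separates category II (coded 1) from category I (coded 0).
    Each equation is a (coefficients, intercept) tuple."""
    if linear_estimate(*eq1, predictors) >= t1:
        return 3
    if linear_estimate(*eq2, predictors) >= t2:
        return 2
    return 1
```

Only observations not assigned to category III by the first equation reach the second equation, which mirrors the exclusion of category III observations when the second equation is trained.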
VI. RESULTS



A. NORTH PACIFIC OCEAN 

1. First-Predictor Selection and Interval Determination

The first predictor selected, for equally populous
intervals (EPI's) of four through ten, was EHF (Table III).
The constant value of $a_1$ for the maximum-probability strategy
indicates that there is no predictability for visibility
category II (the least frequent category in the data set)
using a single predictor. A subjective comparison of the
dependent data adjusted $a_0$ (AA0) and the independent data
adjusted threat score for visibility category I (ATS1) led to
the selection of five EPI's for the developmental model
(Table IV; Fig. 5).

2. Selecting Subsequent Predictors

Once the number of intervals and first predictor
were known, a new $a_0$ computation was made with the first
predictor and each of the remaining predictors. Only six of
the remaining 21 predictors (CLIMO, SEHF, THF, DDWW, H510
and RH), in combination with EHF, gave new $a_0$ values greater
than that for EHF alone (.697); these comprised the pool of
predictors to be considered for further development of the
model. Functional dependence (FD) with EHF was computed for
each of these six predictors, and DDWW was chosen as the second
predictor because it had the lowest FD.






For the determination of the third through sixth
predictors, a new $a_0$ was computed as a function of all of
the previously selected predictors and each of the remaining
predictors. At each stage, the new $a_0$ value for each
remaining predictor was greater than that for the prior
stage, so no further predictors were eliminated from
consideration. FD was then computed for each of the predictors
being considered with each of the predictors previously
selected, and RSS FD determined. At any given stage (three
through six) the new predictor added to the developmental
model was the one with the lowest RSS FD. The third through
sixth predictors, in order of selection, are H510, RH, THF
and CLIMO (Table V).
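
The selection rule for stages three through six can be sketched as below. Two things are assumed here and are not taken from this section: that "RSS" denotes the root-sum-square of a candidate's FD values against each previously selected predictor, and the FD values themselves, which are invented purely for illustration (the candidate names "A", "B", "C" are placeholders).

```python
import math


def select_next_predictor(fd):
    """fd maps candidate name -> list of FD values, one per predictor
    already selected. Returns (best candidate, dict of RSS FD values)."""
    # Assumed definition: RSS FD = square root of the sum of squared FDs.
    rss = {name: math.sqrt(sum(v * v for v in vals))
           for name, vals in fd.items()}
    # The candidate with the lowest RSS FD is added to the model.
    return min(rss, key=rss.get), rss


# Hypothetical FD values for three candidates against two selected predictors.
fd_stage3 = {"A": [0.10, 0.20], "B": [0.30, 0.15], "C": [0.40, 0.40]}
best, rss = select_next_predictor(fd_stage3)
print(best, {k: round(v, 3) for k, v in rss.items()})
```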

3. Determining the Final Model

The final model for the Preisendorfer (1983 a,b,c)
methodology was determined by comparing the independent data
contingency table statistics from each developmental stage
and choosing the fourth stage, because it gave the highest
adjusted threat score for visibility category I (ATS1)
(Fig. 6). The contingency tables for stage four and the
related statistics for the three strategies are shown in Table
VI.

4. Linear Regression

A single linear-regression equation was developed
from the North Pacific Ocean data using method 1. Both the
quadratic and equal-variance threshold models (Appendix B)
were applied, but only the threshold values from the
equal-variance model were used to compare methodologies. Table
VII contains the linear regression equation, the visibility
category linear regression statistics and the threshold
values. Contingency tables and related statistics for the
dependent and independent data are shown in Table VIII.

5. Discussion

The best results obtained from the North Pacific
Ocean data were from the Preisendorfer (1983 a,b,c) methodology,
MAXPROB2 strategy, which has the highest independent data
adjusted threat scores for visibility category I and for
combined categories I/II (ATS1 = .20, ATS12 = -.05). Each of
the maximum-probability strategies (MAXPROB1: ATS1 = .17,
ATS12 = -.10) did better than linear regression (ATS1 = .16,
ATS12 = -.13), while natural-regression showed the poorest
skill (ATS1 = -.02, ATS12 = -.19).

It appears, from Fig. 6, that most of the usable 
forecastability resides in the first predictor chosen. This 
would indicate that it may be profitable to search for 
better predictors by combining model output parameters, 
conducting dimensional analysis or using linear-regression 
equation estimates as predictors as was done in the North 
Atlantic Ocean experiments which follow. 

B. NORTH ATLANTIC OCEAN AREA 3W 

Based upon the results obtained in the North Pacific
Ocean, it was decided to use the linear regression model to
generate equations which could be used as predictors. Seven
such equations were developed, each representing a different
menu of parameters available to the regression model. The
seven equations are included in Appendix D. The Preisendorfer
(1983 a,b,c) methodology then proceeded both with
and without these linear-regression equations available as
predictors.

1. First-Predictor Selection and Interval Determination

a. Without Linear-Regression Equations as Predictors

The first predictor, for EPI's of four through
ten, varied with the number of intervals (Table IX). A
comparison of the dependent data AA0 and the independent
data ATS1 determined the selection of eight EPI's for the
model (Table X) and, therefore, SMF as the first predictor.
However, through investigator error, the model was initially
developed with five EPI's and E925 as the first predictor.
Therefore, both results will be presented.

b. With Linear-Regression Equations as Predictors

The first predictor for each EPI of four through
ten is BM1, the predictand estimate computed by the linear
regression equation developed when all of the predictors
were available to the regression model (Table XI). Two of
the EPI's, namely four and eight, have identical, and best,
dependent data AA0 and independent data ATS1 scores (Table
XII, Fig. 7), so it was decided to proceed with the
developmental model for both intervals.

2. Selecting Subsequent Predictors

Subsequent predictors were chosen in the same way as
described in the procedures and for the North Pacific Ocean
experiment. Without linear regression equations as
predictors, the predictors are SMF, D850, RH, UBLW and ENTRN
for eight EPI's (Table XIII) and E925, U700, DVDP, STRTFQ,
ENTRN and PS for five EPI's (Table XIV). With linear
regression equations as predictors, the predictors are
BM1, U850, D500, V850, D1000 and U1000 for four intervals
(Table XV) and BM1, U500, ENTRN, DVDP and BM4 for eight
intervals (Table XVI). Significance tests were made after
each predictor selection, and $a_0(05)$ and $a_1(05)$ values are
included in Tables XIII, XV and XVI. A comparison of the
behavior of critical level statistics, as predictors are
added, for both four and eight intervals, is shown in Figs.
8 and 9, where array size is equal to the number of EPI's
raised to a power equal to the number of predictors included
at that stage.
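
As a quick check on how rapidly the array grows, the stated relation (number of EPI's raised to the number of predictors) can be evaluated directly. The function name `array_size` is illustrative only.

```python
def array_size(n_intervals: int, n_predictors: int) -> int:
    """Number of columns in the combined predictor array at a given stage."""
    return n_intervals ** n_predictors


# Four intervals with up to six predictors, eight intervals with up to five,
# matching the number of predictors listed for Tables XV and XVI.
four = [array_size(4, k) for k in range(1, 7)]
eight = [array_size(8, k) for k in range(1, 6)]
print(four)   # [4, 16, 64, 256, 1024, 4096]
print(eight)  # [8, 64, 512, 4096, 32768]
```

The eight-interval array at stage five is eight times larger than the four-interval array at stage six, which is consistent with the storage limitations mentioned elsewhere in this study.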

3. Determining the Final Model

The final model for the Preisendorfer (1983 a,b,c)
methodology was determined by comparing the independent data
contingency table statistics from each developmental stage
and choosing that stage which gave the highest adjusted
threat score for visibility category I (ATS1).

a. Without Linear Regression Equations as
Predictors (Eight Intervals)

It was determined, from Fig. 10, that the fifth
stage gave the best results (MAXPROB1, independent data:
ATS1 = .19, ATS2 = .03, ATS12 = -.05). The contingency tables
for stage five and related statistics for the three strategies
are shown in Table XVII.

b. Without Linear Regression Equations as
Predictors (Five Intervals)

It was determined, from Fig. 11, that the fifth
stage gave the best results (MAXPROB2, independent data:
ATS1 = .25, ATS2 = .02, ATS12 = .01). The contingency tables
for stage five and related statistics for the three strategies
are shown in Table XVIII.

c. With Linear Regression Equations as
Predictors (Four Intervals)

It was determined, from Fig. 12, that the fourth
stage gave the best results (MAXPROB2, independent data:
ATS1 = .40, ATS2 = -.05, ATS12 = .12). The contingency tables
for stage four and related statistics for the three strategies
are shown in Table XIX.

d. With Linear Regression Equations as
Predictors (Eight Intervals)

It was determined, from Fig. 13, that the second
stage gave the best results (MAXPROB2, independent data:
ATS1 = .32, ATS2 = -.14, ATS12 = .02). The contingency tables
for stage two and related statistics for the three strategies
are shown in Table XX.

4. Linear Regression

Both linear regression methods (single equation and
decision tree) and both threshold models (quadratic and
equal variance) [Lowe, 1984a] were used for comparison with the
Preisendorfer (1983 a,b,c) methodology in the North Atlantic
Ocean Area 3W. Additionally, the predictors available for
regression were varied as follows. The first regression was
conducted with all available MOP's, while the second regression
was conducted using only the best predictors from the
Preisendorfer methodology (defined as those predictors which,
alone, produced an $a_0$ value greater than the frequency of
visibility category III in the dependent data). Table XXI
contains the linear-regression equations, associated
visibility category statistics and threshold values. Tables
XXII through XXVII contain the contingency tables and related
statistics for the dependent and independent data for each of
the linear regression variations.

5. Discussion

Table XXVIII summarizes each of the methodologies and
strategies applied to the North Atlantic Ocean Area 3W
data. In general, the maximum-probability strategy did
better than the other methods or strategies. Specifically,
the best results overall were obtained by the MAXPROB2
strategy, using predictors computed from linear regression
equations and four equally populous intervals. The methodology
without linear regression equations as predictors, and all
of the linear regression results, are about equivalent. The
best linear regression method is the decision tree, when all
MOP's are made available to the regression model. The results
obtained without linear regression equations as predictors
appear to discount the procedure established for choosing the
number of equally populous predictor intervals, but lend
support to the claim in Chapter V that there is a tendency
for the Preisendorfer (1983 a,b,c) methodology to give better
results using a small number of intervals.

VII. CONCLUSIONS AND RECOMMENDATIONS



The primary objective of this study was to determine 
if the Preisendorfer (1983 a,b,c) methodology applied to the 
FNOC NOGAPS model output parameters could improve upon the 
forecasting of atmospheric marine horizontal visibility, in 
three categories, when compared to the more traditional 
method of least squares, multiple linear regression. It was 
shown that, indeed, the proposed methodology, namely, the 
maximum probability strategy, was superior when predictand 
estimates, computed from linear regression equations 
themselves, were used as predictors. 

The method of determining the number of equally populous
predictor intervals requires further investigation. The
results from the North Atlantic Ocean Area 3W, without
linear regression equations as predictors, showed that the
proposed method was not the best, in that the number of
intervals determined by the method was eight but better results
were obtained with five. Additionally, only intervals of
ten or less were considered here, due to storage limitations
imposed by the computer center. As a result, the optimum
number of predictor intervals remains undetermined.

Predictor determination appears to be adequate. At each
stage of development a unique predictor was selected. The
only foreseeable problem arises if, during the first (initial)
stage of development, multiple predictors have identical CE
and PP values, or, during subsequent stages, multiple
predictors have identical $a_0$ and FD values. Should this occur,
the model development would have to proceed, from that
particular stage, with each of the identified predictors.

The methodology appears to be sensitive, in two ways, to
the first predictor selected. First, there is an initial
large value for the independent data ATS1 and small
incremental increases thereafter for each new predictor added.
Secondly, there is a large difference in magnitude between the
initial independent data ATS1 values for the Preisendorfer
methodology without linear regression equations as
predictors (ATS1 = .13; .14) and those with linear regression
equations as predictors (ATS1 = .30), for the maximum
probability strategy.

The best strategy is MAXPROB2, followed by MAXPROB1, and
then natural-regression. Generally, natural-regression does
worse than linear regression. None of the methods did well
in predicting visibility category II, which may indicate
that visibility would be best handled as a two-category
phenomenon.

The number of independent data observations (1526) in
North Atlantic Ocean Area 3W was sufficient to test the
methodology. This was demonstrated by the similarity between
the results for Area 3W, without linear regression equations as
predictors, and the North Pacific Ocean results (3682
observations). The small differences in the contingency
table statistics for the independent data for the two
experiments can be attributed to parameters being from different
models and for different months.

The following recommendations are offered for future
research and to future researchers:

1. Investigate the problem of determining the optimum
number of equally populous predictor intervals.
Possibly, a statistic similar to the threat scores
or adjusted threat scores could be used or, more simply,
the number of intervals, between two and ten, which gives
the highest adjusted threat scores for the independent
data could be chosen. Alternatively, adopt, without further
experimentation, five as the number of EPI's, which appears to
be a compromise between a gross resolution of the
predictor parameter range and a fine (but too expensive)
resolution of the predictor parameter range.

2. Investigate the use of potential predictability (PP)
in determining the selection of predictors. During
the initial stage of development, PP is computed for
all available predictors and provides a measure of
each predictor's individual ability to forecast
visibility, but it is not used explicitly. One option
is to compute the mean and standard deviation of PP
during the initial stage and to remove from consideration
those predictors whose PP is not greater than the mean
minus one standard deviation or, more simply, not greater
than the mean. This would ensure that only those predictors
which have a relatively high prospect of forecasting
visibility will be available for subsequent selection.

3. Search for better predictors which are particularly
suited to visibility prediction. Recommended sources
are: new, direct and derived, model output parameters
(including original model output); non-dimensional
parameters derived from dimensional analysis; and
boundary-layer parameters such as the optical structure
function ($C_n^2$) and extinction coefficients.

4. Investigate a two-category visibility scheme. 

5. Install automatic visibility recorders on ocean-going 
military and civilian passenger/cargo ships. This 
will place visibility observations on a more objective 
basis and lead to improved methods of forecasting 
visibility, as well as verifying such forecasts. 

6. Investigate new prediction models, preferably those
which attempt to manipulate the observed data to
correct for probable observer bias (following Selsor,
1980; Renard and Thompson, 1984). This would be
unnecessary if recommendation 5 were acted upon.

7. Investigate other ocean areas and seasons to determine 
if the physically homogeneous area scheme is consistent 
and viable. Develop prediction tables and other aids 
specifically tailored to region and season. 

8. Use a statistic other than ATS1 for choosing the
first predictor and for comparing methods and strategies.
ATS1 was used in this study largely because of
its greater magnitude, as compared to ATS2 and ATS12,
which was due to the relatively high frequency of
visibility category I in both data sets. In general, this
will not be the case. Because three visibility categories
are being considered, and good forecasts of
the two poorest visibility categories are desirable, a
statistic such as ATS12 would be better suited as a
consistent comparison statistic for future researchers.

9. As soon as it is feasible, eliminate from further
testing the MAXPROB1 strategy in order to allow for
more efficient and faster program execution. The
natural-regression strategy, though it gave the poorest
results in this study, should be re-examined when
predictands with relatively many discrete states
(e.g., ceiling) are considered. It has, in such
settings, potential to outperform the more rigid
linear regression technique.
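
The screening suggested in recommendation 2 might look like the following sketch. The PP values are those of the artificial data set in Appendix A (Table A2). The function name `screen_by_pp`, the parameter `k`, and the use of the sample (rather than population) standard deviation are assumptions, since the recommendation does not specify them.

```python
from statistics import mean, stdev


def screen_by_pp(pp, k=1.0):
    """Keep predictors whose PP is greater than mean(PP) - k * stdev(PP)."""
    values = list(pp.values())
    # Assumption: sample standard deviation; k = 0 gives the "mean" variant.
    cutoff = mean(values) - k * stdev(values)
    return {name for name, value in pp.items() if value > cutoff}


# PP values of the four artificial predictors (Appendix A, Table A2).
pp_values = {"EHF": 0.330, "RH": 0.225, "FTER": 0.211, "ASTD": 0.209}
kept_loose = screen_by_pp(pp_values, k=1.0)   # mean minus one std. deviation
kept_strict = screen_by_pp(pp_values, k=0.0)  # mean only
print(sorted(kept_loose), sorted(kept_strict))
```

For this small example the looser cutoff retains all four predictors, while the stricter cutoff retains only EHF.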

APPENDIX A



A DISCUSSION OF THE STATISTICAL PROCEDURES PROPOSED BY 
PREISENDORFER (1983 a,b,c) FOR THE FORECASTING OF 
ATMOSPHERIC MARINE HORIZONTAL VISIBILITY USING 
MODEL OUTPUT STATISTICS 

I. INTRODUCTION 

The following discussion is based upon three unpublished 
research papers by Preisendorfer (1983 a,b,c). His proposed 
methodology deals with a simple statistical manipulation of 
model output parameters (predictors) which have been trans- 
formed from continuous to discrete quantities by grouping 
each predictor into equally populous intervals. The proce- 
dural approach in applying his methodology to model output 
statistics (MOS) forecasting, is as follows: 

1. Generate predictand/predictor pairs of data using the
United States Navy Fleet Numerical Oceanography Center
Navy Operational Global Atmospheric Prediction System
(NOGAPS) model output (predictors) and synoptic ship
visibility observations (predictand) provided by the
Naval Oceanography Command Detachment, Asheville, NC,
and generate bivariate plots.

2. Generate conditional probability tables based on the
distribution of the predictand/predictor pairs.

3. Define prediction strategies based on the conditional
probabilities.

4. Compute the potential predictability of visibility
from the conditional probability tables.

5. Compute skill scores of the prediction strategies and
choose the first predictor.

6. Repeat steps 1, 2, 4, and 5, for multiple predictors.

7. Compute functional dependence of selected vs. potential
subsequent predictors.

8. Choose the next predictor.

9. Repeat steps 1, 2, 4, 5, 7, and 8, until model
development is terminated.

For demonstration purposes, an artificial data set of
99 cases, consisting of four predictors plus visibility
(predictand), will be used throughout this discussion.
Each predictor parameter is divided into three equally
populous intervals and visibility is divided into three categories,
as illustrated in Table A1. The four predictors are
Evaporative Heat Flux (EHF), Fog Probability Parameter
(FTER), Relative Humidity (RH) and Air-Sea Temperature
Difference (ASTD). Visibility categories are defined by the
marine visibility observation codes (MVOC) included in the
categories.

TABLE A1

ARTIFICIAL DATA SET

Interval 1        Interval 2             Interval 3

EHF ≤ 2.65        2.65 < EHF ≤ 4.44      EHF > 4.44
FTER ≤ .024       .024 < FTER ≤ .9       FTER > .9
RH ≤ 85.9         85.9 < RH ≤ 90.0       RH > 90.0
ASTD ≤ 1.02       1.02 < ASTD ≤ 1.91     ASTD > 1.91

Visibility Category I:   MVOC 90 -> 94 (60 cases)
Visibility Category II:  MVOC 95 & 96  (20 cases)
Visibility Category III: MVOC 97 -> 99 (19 cases)
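
A minimal sketch of how interval limits like those in Table A1 could be derived is given below. It assumes the limits are taken from the sorted predictor values so that each interval holds (nearly) the same number of cases, with each interval closed on its upper limit as in the table. The function names are hypothetical, the sample values are synthetic, and tied values at a limit are not treated specially.

```python
def equal_population_edges(values, m):
    """Return the m-1 upper limits that split the sorted values
    into m (nearly) equally populous intervals."""
    s = sorted(values)
    n = len(s)
    return [s[(k * n) // m - 1] for k in range(1, m)]


def interval_index(x, edges):
    """1-based interval number: interval 1 is x <= edges[0],
    the last interval is x > edges[-1]."""
    return 1 + sum(x > e for e in edges)


# Synthetic data: 99 distinct values in scrambled order (not study data).
values = [(37 * k) % 99 for k in range(99)]
edges = equal_population_edges(values, 3)
population = [0, 0, 0]
for v in values:
    population[interval_index(v, edges) - 1] += 1
print(edges, population)  # each interval receives 33 of the 99 cases
```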



II. SINGLE PREDICTOR STATISTICS 



A. BIVARIATE PAIRS 

Choose various visibility-predictor pairs and make 
bivariate plots of these pairs. This will provide immediate 
visual estimation of the potential predictability. As an 
example, let us suppose that predictor EHF of our artificial 
data set has 33 cases in each equally populous interval and 
that the visibility categories I, II and III are respectively 
represented by 17, 7 and 9 in interval 1; 1, 7 and 25 in 
interval 2; 1, 6 and 26 in interval 3. To make the bivariate 
plot, simply make a tabular summary of this information, as 
illustrated in Fig. 14. Now we define, from the bivariate 
plot, our coordinate system and nomenclature. Items in
parentheses are examples from Fig. 14; numbers in brackets
are equation numbers from Preisendorfer (1983 a,b,c), with
a letter designator indicating the paper from which each was
obtained.



n = number of visibility categories (n = 3)

m = number of equally populous predictor intervals (m = 3)

j = the vertical counting index (j = 1,...,n)

i = the horizontal counting index (i = 1,...,m)

n(i,j) = individual cell counts (n(1,3) = 9)

n(.,j) = marginal predictand totals = $\sum_{i=1}^{m} n(i,j)$
= row totals (n(.,2) = 20)   [3.1a]

n(i,.) = marginal predictor totals = $\sum_{j=1}^{n} n(i,j)$
= column totals (n(2,.) = 33)   [3.2a]

n(.,.) = total predictand/predictor pairs
= $\sum_{j=1}^{n} \sum_{i=1}^{m} n(i,j)$ = sum over all cells
(n(.,.) = 99)   [3.3a]
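
The cell counts and marginal totals for the EHF example can be checked with a few lines of code. This is only an illustrative sketch; the nested-list layout `counts[i-1][j-1]` for n(i,j) is an assumption of the example, and the counts are those quoted in the text for Fig. 14.

```python
# counts[i-1][j-1] = n(i, j): predictor interval i, visibility category j.
counts = [
    [17, 7, 9],   # interval 1: categories I, II, III
    [1, 7, 25],   # interval 2
    [1, 6, 26],   # interval 3
]
m, n = len(counts), len(counts[0])

col_totals = [sum(col) for col in counts]                              # n(i, .)
row_totals = [sum(counts[i][j] for i in range(m)) for j in range(n)]   # n(., j)
total = sum(col_totals)                                                # n(., .)

print(counts[0][2])    # n(1,3) = 9
print(row_totals[1])   # n(.,2) = 20
print(col_totals[1])   # n(2,.) = 33
print(total)           # n(.,.) = 99
```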



B. CONDITIONAL PROBABILITIES 

From the bivariate pairs determine the conditional
probability of visibility given a predictor. We will continue from
the bivariate plot in Fig. 14, and define three probabilities:

$p_{12}(i,j) = n(i,j)/n(.,.)$ = joint probability of a
predictand-predictor pair occurring in a
given cell = individual cell count
divided by the total number of cases
($p_{12}(3,3) = 26/99 = .2626$)   [3.5a]

$p_1(i) = n(i,.)/n(.,.)$ = marginal probability of
predictor = column total divided by the
total number of cases = the column sum of
the joint probabilities
($p_1(2) = 33/99 = .333$)   [3.6a]

$p_2(j) = n(.,j)/n(.,.)$ = marginal probability of
predictand = row total divided by the
total number of cases = the row sum of the
joint probabilities ($p_2(2) = 20/99 = .202$)   [3.7a]



We can now build a joint/marginal probability table as
illustrated in Fig. 15, and define conditional probability.

$p_{21}(j|i) = p_{12}(i,j)/p_1(i) = n(i,j)/n(i,.)$
= conditional probability of predictand given
a predictor = a cell's joint probability
divided by the marginal probability of
predictor = individual cell count divided
by column total
($p_{21}(2|2) = .071/.333 = 7/33 = .212$)   [3.8a]



Now build a conditional probability table as illustrated
in Fig. 16. Conditional probability of visibility, given
some predictor, is the quantity of greatest interest in this
study. Note that if $p_{21}(j|i) = 1/n$ for j = 1,...,n at
some i (i.e., each cell contains 1/n of the cases in its
column), then very little information is available to predict
visibility at that i. However, if $p_{21}(j_0|i) = 1$ for some
$j_0$ and $p_{21}(j|i) = 0$ for all other j values, then there is
perfect predictability of class $j_0$ by the predictor at class
i. The underlying methodology of this study will be to
determine the maximum conditional probability of visibility
for each predictor value.
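
The three probabilities and the conditional probability table can be computed from the cell counts as sketched below. The counts are those quoted for the EHF example; the variable names and list layout are assumptions of this illustration.

```python
# counts[i-1][j-1] = n(i, j) for the EHF example.
counts = [[17, 7, 9], [1, 7, 25], [1, 6, 26]]
total = sum(sum(col) for col in counts)

# Joint probability p12(i,j) = n(i,j) / n(.,.)
joint = [[c / total for c in col] for col in counts]
# Marginal probability of predictor p1(i) = n(i,.) / n(.,.)
p1 = [sum(col) / total for col in counts]
# Marginal probability of predictand p2(j) = n(.,j) / n(.,.)
p2 = [sum(col[j] for col in counts) / total for j in range(len(counts[0]))]
# Conditional probability p21(j|i) = n(i,j) / n(i,.)
cond = [[c / sum(col) for c in col] for col in counts]

print(round(joint[2][2], 4))  # p12(3,3) = 26/99 -> 0.2626
print(round(p1[1], 3))        # p1(2)    = 33/99 -> 0.333
print(round(p2[1], 3))        # p2(2)    = 20/99 -> 0.202
print(round(cond[1][1], 3))   # p21(2|2) = 7/33  -> 0.212
```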

C. STRATEGIES

Preisendorfer (1983 a,b,c) presents three different
prediction strategies, two based on maximum probabilities
(MAXPROB1 and MAXPROB2) and one based on natural regression.

1. Maximum Probability

This strategy consists of determining the cell, in a
given column, with the highest conditional probability, and
assigning to the column the visibility category associated with
that cell. As each column represents an interval of
predictor values, we now have a visibility forecast value associated
with that interval. In our example with EHF (Fig. 16),
interval 1 (i = 1) will have a forecast value of visibility
category I (VISCAT 1). Hence, if we used only EHF as a
predictor, every time a value of EHF ≤ 2.65 was encountered,
we would predict visibility category I. Similarly,
for interval 2 (i = 2) and for interval 3 (i = 3)
we would choose visibility category III (VISCAT 3).

MAXPROB1 and MAXPROB2 differ only in the way they
handle a tie between maximal conditional probabilities in
a column. Should this occur, then a decision must be made
as to which predictand category will be assigned to that
predictor interval. In MAXPROB1, this decision is made by
a coin toss, figuratively. A random number, in the unit
interval, is generated. The unit interval is divided into a
number of subintervals equal to the number of tied values
and each subinterval is assigned to a specific predictand
category. The subinterval into which the random number
falls determines the forecast visibility category. In
MAXPROB2, the lowest predictand category, among the tied
categories, is chosen.
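
The two tie-breaking rules can be sketched as follows. The function name `maxprob` and its arguments are hypothetical, and equal-width subintervals of the unit interval are assumed for MAXPROB1. A column may be given as counts or as conditional probabilities, since both have their maximum in the same cell.

```python
import random


def maxprob(column, tie_break="lowest", rng=None):
    """Return the 1-based category with the largest entry in the column.

    tie_break="lowest" follows MAXPROB2 (lowest tied category);
    tie_break="random" follows MAXPROB1 (figurative coin toss).
    """
    best = max(column)
    tied = [j for j, p in enumerate(column, start=1) if p == best]
    if len(tied) == 1 or tie_break == "lowest":
        return tied[0]
    # MAXPROB1: draw u in [0, 1) and split the unit interval into
    # len(tied) equal subintervals, one per tied category (assumed equal).
    u = (rng or random).random()
    return tied[int(u * len(tied))]


# EHF example columns (cell counts per visibility category).
forecasts = [maxprob(col) for col in ([17, 7, 9], [1, 7, 25], [1, 6, 26])]
print(forecasts)  # [1, 3, 3]

# A made-up tied column: MAXPROB2 picks category 1, MAXPROB1 picks 1 or 2.
print(maxprob([5, 5, 2], "lowest"), maxprob([5, 5, 2], "random"))
```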

2. Natural Regression

This strategy consists of first finding the average
predictand (visibility category) for each predictor interval,
using conditional probabilities, and then choosing the
predictand category nearest the average.

$$\bar{j}(i) = \sum_{j=1}^{n} j \, p_{21}(j|i)$$   [7.1b]

Fig. 17 shows the computation for EHF interval 1 (i = 1).
Visibility category II (VISCAT 2) would be assigned to this
interval by this strategy.
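
The natural-regression assignment for EHF interval 1 can be reproduced with the short sketch below. The function name is hypothetical, and the handling of an average lying exactly halfway between two categories (the lower category is taken) is an assumption, since that case is not addressed here.

```python
def natural_regression(column_counts):
    """Return (average category, nearest category) for one predictor interval."""
    total = sum(column_counts)
    # Average predictand: sum over j of j * p21(j|i).
    average = sum(j * c / total for j, c in enumerate(column_counts, start=1))
    # Nearest category; min() returns the lower category on an exact tie.
    nearest = min(range(1, len(column_counts) + 1),
                  key=lambda j: abs(j - average))
    return average, nearest


average, category = natural_regression([17, 7, 9])  # EHF interval 1
print(round(average, 2), category)  # 1.76 2
```

The average of 58/33 (about 1.76) is nearest to 2, in agreement with the assignment of visibility category II stated above.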

D. COMPARISON STATISTICS

To determine if a predictor will be useful in forecasting,
there should be a statistic with which to compare its
potential utility. Preisendorfer (1983 a,b,c) defines four such
statistics and their critical values. The four statistics
defined are potential predictability (PP), class-error
probabilities ($a_0$, $a_1$), and functional dependence (FD).
Potential predictability and class-error probabilities will
be defined now. Functional dependence will be addressed
later.

1. Potential Predictability

Potential predictability of a predictand/predictor
pair is defined as:

$$PP(2|1) = \frac{n}{n-1} \sum_{i=1}^{m} p_1(i) \left[ \sum_{j=1}^{n} \left( p_{21}(j|i) - \frac{1}{n} \right)^2 \right] = \sum_{i=1}^{m} p_1(i) \, PP(i)$$   [4.1a]

where:

$$PP(i) = \frac{n}{n-1} \sum_{j=1}^{n} \left( p_{21}(j|i) - \frac{1}{n} \right)^2 ,$$

$p_1(i)$ = the marginal probability of a predictor, and

$p_{21}(j|i)$ = conditional probability of the jth
predictand, given the ith predictor.

$PP(2|1)$ is loosely related to Shannon's definition of
information [Preisendorfer, 1983a]. An example calculation is
shown in Fig. 18 where EHF has a PP value of .330. To
determine if this would be the best predictor using this
statistic, compute the potential predictability for all
predictors and rank them from highest to lowest. The
predictor with the highest PP should be the best predictor
for forecasting visibility using any strategy.
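
The PP value quoted for EHF can be checked numerically. The sketch below evaluates the definition of potential predictability directly from the EHF cell counts; the function name and the nested-list layout are assumptions of this illustration.

```python
def potential_predictability(counts):
    """PP(2|1) for a table with counts[i-1][j-1] = n(i, j)."""
    n = len(counts[0])                       # number of predictand categories
    total = sum(sum(col) for col in counts)  # n(., .)
    pp = 0.0
    for col in counts:
        col_total = sum(col)
        # PP(i) = n/(n-1) * sum_j (p21(j|i) - 1/n)^2
        pp_i = n / (n - 1) * sum((c / col_total - 1 / n) ** 2 for c in col)
        # Weight by the marginal probability p1(i) = n(i,.) / n(.,.)
        pp += (col_total / total) * pp_i
    return pp


ehf_counts = [[17, 7, 9], [1, 7, 25], [1, 6, 26]]
pp_ehf = potential_predictability(ehf_counts)
print(round(pp_ehf, 3))  # 0.33
```

A table whose columns are all uniform gives PP = 0, and a table in which every column has a single occupied cell gives PP = 1.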

2. Class-Error Probabilities

Zero-class ($a_0$) and one-class ($a_1$) error probabilities
can be defined to gauge the predictive skill of a
prediction strategy.

$$a_0 = \sum_{i=1}^{m} p_1(i) \, p_{21}(j_0(i)|i)$$   [6.1a]

where:

$p_1(i)$ = the marginal probability of the predictor,

$j_0(i)$ = the $j_0$th cell in column i assigned by
the prediction strategy, and

$p_{21}(j_0(i)|i)$ = conditional probability of the $j_0(i)$th cell.

From Figs. 15 and 16, $p_1(i) = .333$ for all i; $j_0(1) = 1$,
$p_{21}(j_0(1)|1) = .515$; $j_0(2) = 3$, $p_{21}(j_0(2)|2) = .758$; and
$j_0(3) = 3$, $p_{21}(j_0(3)|3) = .788$. Therefore, if EHF is the only
predictor,

$$a_0 = (.333)(.515) + (.333)(.758) + (.333)(.788) = .686$$

The statistic $a_0$ is, by definition, equal to the fraction of
correct forecasts in the dependent data set.

$$a_1 = \sum_{i=1}^{m} p_1(i) \left[ p_{21}(j_0(i)+1|i) + p_{21}(j_0(i)-1|i) \right]$$   [6.2a]

where:

$p_{21}(j_0(i) \pm 1|i)$ = the conditional probabilities
adjacent to the $p_{21}(j_0(i)|i)$ values used in the $a_0$
determination.

If $j_0(i) = 1$ then, by definition, $p_{21}(j_0(i)-1|i) = 0$;
similarly, if $j_0(i) = n$ then, by definition,
$p_{21}(j_0(i)+1|i) = 0$.

The statistic $a_1$ is, by definition, equal to the fraction of
forecasts for which a class 1 error has been committed.
Again, from Figs. 15 and 16:

$$a_1 = (.333)(.212+0) + (.333)(.212+0) + (.333)(.182+0) = .202$$
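
Both class-error probabilities can be computed from the cell counts and the categories assigned by a strategy, as in the sketch below. The function name and argument layout are assumptions. The code uses the identity $p_1(i) \, p_{21}(j|i) = n(i,j)/n(.,.)$, so no intermediate rounding occurs.

```python
def class_errors(counts, assigned):
    """Return (a0, a1) for counts[i-1][j-1] = n(i, j) and
    assigned[i-1] = j0(i), the category forecast for interval i."""
    total = sum(sum(col) for col in counts)
    a0 = a1 = 0.0
    for col, j0 in zip(counts, assigned):
        # Zero-class: cases falling in the forecast cell.
        a0 += col[j0 - 1] / total
        # One-class: cases in the adjacent cells; a missing neighbour counts 0.
        below = col[j0 - 2] if j0 > 1 else 0
        above = col[j0] if j0 < len(col) else 0
        a1 += (below + above) / total
    return a0, a1


ehf_counts = [[17, 7, 9], [1, 7, 25], [1, 6, 26]]
a0, a1 = class_errors(ehf_counts, assigned=[1, 3, 3])
print(round(a0, 4), round(a1, 4))  # 0.6869 0.202
```

The unrounded $a_0$ is 68/99, or about .687; the value .686 in the worked example results from rounding $p_1(i)$ and the conditional probabilities to three places before multiplying.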

To determine which one of two or more predictors is
the most skillful, we can plot the ($a_0$, $a_1$) pairs on a skill
diagram as in Fig. 19. The dashed lines are lines of
constant class error (CE); the more skillful
predictors will lie on the lower right part of the triangle.
In general, the skill on the diagram decreases according to
the zig-zag rule shown in the figure. If, for all
predictors, $a_1$ is constant, which may occur during the first
predictor determination with a data set containing relatively
few poor visibility cases, then the best predictor is the
one with the greatest $a_0$ value. In this instance there is
no need to plot the pairs.

III. MULTIPLE PREDICTOR STATISTICS



Once all predictand/predictor pairs have been formed
and potential predictability and skill scores determined,
the predictors can be ordered by decreasing predictor skill
and by potential predictability. Fig. 20 contains the
bivariate plot, conditional probabilities, potential
predictability and skill scores for the remaining three
predictors in our artificial data set. The ordering of predictors
is shown in Table A2. EHF would therefore be chosen as
our first predictor, as illustrated on the skill diagram
in Fig. 19. As RH, FTER and ASTD have equal $a_0$ and $a_1$
values, they are ranked according to decreasing potential
predictability.



TABLE A2

RANKING OF PREDICTORS BY SKILL
AND POTENTIAL PREDICTABILITY

Rank   Predictor   $a_0$   $a_1$   PP

1st    EHF         .686    .202    .330
2nd    RH          .606    .202    .225
3rd    FTER        .606    .202    .211
4th    ASTD        .606    .202    .209
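
The ordering in Table A2 can be reproduced by a simple sort. The sort key used here (larger $a_0$ first, then smaller $a_1$, then larger PP) is an assumed stand-in for the zig-zag rule of Fig. 19, which is not reproduced in the text; it happens to suffice for this example because $a_1$ is the same for all four predictors.

```python
# (predictor, a0, a1, PP) from Table A2, deliberately out of order.
rows = [
    ("RH", 0.606, 0.202, 0.225),
    ("ASTD", 0.606, 0.202, 0.209),
    ("EHF", 0.686, 0.202, 0.330),
    ("FTER", 0.606, 0.202, 0.211),
]

# Assumed ranking key: a0 descending, a1 ascending, PP descending.
ranked = sorted(rows, key=lambda r: (-r[1], r[2], -r[3]))
order = [name for name, *_ in ranked]
print(order)  # ['EHF', 'RH', 'FTER', 'ASTD']
```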



Preisendorfer (1983b) develops statistics, similar to
those already mentioned, for multiple predictors. The main
conceptual difficulty of additional predictors is the
increase of dimensions. One predictor presents a relatively
simple two-dimensional problem (predictor 1 vs. predictand);
two predictors present a three-dimensional problem (predictor 1
vs. predictor 2 vs. predictand); three or more predictors
present four-dimensional and larger problems. However, with
a little manipulation, all of the problems of more than two
dimensions can be reduced to a two-dimensional
problem. This is illustrated in Figs. 21 and 22 for three
dimensions (two predictors) and four dimensions (three
predictors). An easily programmable equation can be developed to
create these two-dimensional arrays based upon the number of
equally populous intervals for each predictor and upon the
interval in which a particular data case resides.

In our continuing example, reduce the equally populous
intervals for each predictor to an integer number (i = 1,...,m)
with 1 corresponding to the lowest interval and m corresponding
to the highest interval, as defined for the predictor
index in Section II.A. Let

ii   = the interval integer number for EHF,
jj   = the interval integer number for RH,
kk   = the interval integer number for FTER,
mm   = the interval integer number for ASTD,
ll   = the column location in the two-dimensional
       bivariate plot (equivalent to i for a
       single predictor),
IGP1 = the total number of intervals for EHF,
IGP2 = the total number of intervals for RH,
IGP3 = the total number of intervals for FTER,
IGP4 = the total number of intervals for ASTD.
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Then, for one predictor, EFH: 



11 = ii 

for two predictors, EHF and RH: 

11 = IGP2(ii-l) + jj 

for three predictors, EHF, RH and FTER: 

11 = IGP2 (ii-l+lGPl(kk-l) ) + jj 

for four predictors, EHF, RH, FTER and ASTD: 

11 = IGP2 ( ii-l+IGPl (kk-l+IGP3 {mm-l) ) ) + jj 

This equation form can be expanded to accommodate any mamber 
of predictors. 
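
The column-location equations above amount to a mixed-radix index in which the second predictor varies fastest. The sketch below is one way to program them; the function name `column_index` and the list-based argument layout are assumptions, and the extension beyond four predictors follows the pattern of the printed equations rather than anything stated explicitly in the text.

```python
def column_index(intervals, counts):
    """Column location ll in the two-dimensional array.

    intervals: 1-based interval numbers [ii, jj, kk, mm, ...]
    counts:    intervals per predictor   [IGP1, IGP2, IGP3, IGP4, ...]
    """
    if len(intervals) == 1:
        return intervals[0]
    # Digit order, least significant first: jj, ii, kk, mm, ...
    order = [1, 0] + list(range(2, len(intervals)))
    ll = 0
    for pos in reversed(order):          # most significant digit first
        ll = ll * counts[pos] + (intervals[pos] - 1)
    return ll + 1


# Two predictors with five intervals each: ll = IGP2*(ii-1) + jj
two = column_index([2, 3], [5, 5])
# Three predictors, all in their highest interval
three_last = column_index([5, 5, 5], [5, 5, 5])
```

For five intervals per predictor, the three-predictor array has 125 columns, and a case in the highest interval of every predictor lands in the last column.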



IV. FUNCTIONAL DEPENDENCE 



After the first predictor has been selected, either from
its skill score or potential predictability, we need a means
to determine whether or not to add a new predictor to the
one(s) already chosen. For this purpose, Preisendorfer
(1983c) proposes a functional dependence index (FD) which
describes the dependence of the new predictor being considered
upon those already in the set of predictors. If FD is large
(on the scale 0 to 1) then the new predictor can be
represented by predictors already chosen and its inclusion
into the set of predictors would be redundant. However, if
FD is small (on the scale 0 to 1) then the new predictor is
likely to be a useful addition to the existing collection
of predictors.



$$FD(2|1) = \frac{m}{2(m-1)} \sum_{i=1}^{m} \sum_{j=1}^{n} p_{12}(i,j)\,\lvert q(i,j) - r(i,j) \rvert \qquad (2.1c)$$

where:

$$q(i,j) = \sum_{k=1}^{n-j} p_{2|1}(j+k \mid i+1) + \sum_{k=1}^{j-1} p_{2|1}(j-k \mid i-1) \qquad (2.2c)$$

       = the sum of the conditional probabilities
         which lie in column i+1 and rows greater
         than j and the conditional probabilities
         which lie in column i-1 and rows less than j

       = the sum of the conditional probabilities to
         the right and up, and to the left and down.
         The upper left (1,n) and lower right (m,1)
         cells will always have q values equal to zero.

$$r(i,j) = \sum_{k=1}^{j-1} p_{2|1}(j-k \mid i+1) + \sum_{k=1}^{n-j} p_{2|1}(j+k \mid i-1) \qquad (2.3c)$$

       = the sum of the conditional probabilities
         which lie in column i+1 and rows less than j
         and the conditional probabilities which lie
         in column i-1 and rows greater than j

       = the sum of the conditional probabilities
         to the right and down, and to the left and up.
         The upper right (m,n) and lower left (1,1)
         cells will always have r values equal to zero.

$p_{12}(i,j)$ and $p_{2|1}(j \pm k \mid i \pm 1)$ = the joint and conditional
         probabilities defined earlier, differing
         only in that the abscissa and ordinate are
         now predictor vs. predictor rather than
         predictor vs. visibility.

Fig. 23 illustrates the FD computation for RH given EHF.

In this example, FD(2|1) = FD(RH|EHF) = .286.
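
As a rough illustration of Eqs. (2.1c) through (2.3c), the sketch below computes FD from a table of joint probabilities. It is written under stated assumptions: the name `functional_dependence` is hypothetical, `joint[i][j]` holds the joint probability for column i of the first predictor and row j of the second (0-based in the code), columns outside the table contribute zero, and the normalizing factor is read as m/(2(m-1)).

```python
def functional_dependence(joint):
    """Functional dependence index of predictor 2 on predictor 1."""
    m = len(joint)          # number of columns (predictor 1 intervals)
    n = len(joint[0])       # number of rows (predictor 2 intervals)

    # Conditional probabilities p(j | i); an empty column contributes zeros.
    cond = []
    for col in joint:
        col_total = sum(col)
        cond.append([p / col_total if col_total > 0 else 0.0 for p in col])

    def above(i, j):
        # Conditional probability in column i for rows greater than j.
        return sum(cond[i][j + 1:]) if 0 <= i < m else 0.0

    def below(i, j):
        # Conditional probability in column i for rows less than j.
        return sum(cond[i][:j]) if 0 <= i < m else 0.0

    weighted = 0.0
    for i in range(m):
        for j in range(n):
            q = above(i + 1, j) + below(i - 1, j)   # right-and-up, left-and-down
            r = below(i + 1, j) + above(i - 1, j)   # right-and-down, left-and-up
            weighted += joint[i][j] * abs(q - r)
    return m / (2.0 * (m - 1)) * weighted


third = 1.0 / 3.0
fd_diagonal = functional_dependence(
    [[third, 0.0, 0.0], [0.0, third, 0.0], [0.0, 0.0, third]]
)
fd_uniform = functional_dependence([[1.0 / 9.0] * 3 for _ in range(3)])
```

Under this reading a perfectly diagonal 3 x 3 table gives FD = 1, while a uniform table still gives a small positive value (2/9) because of the edge columns; this is consistent with the need for a critical value against which FD is judged.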

V. CRITICAL VALUES 

Once the various statistics have been found, a means to
determine whether they are significant must be established.
Preisendorfer (1983a,b,c) proposes the use of Monte Carlo
means, applied as follows.

From the bivariate plot, as in Figs. 14, 21b and 22b,
we determine the marginal probabilities of the predictor
($p_1(i)$) and establish incremental values from 0 to 1 (note
that for equally populous predictor intervals, $p_1(i) = 1/m$,
a constant, where m = the number of intervals). We then cast
a total of n(.,.) randomly generated numbers into the
intervals to simulate a new data set. After each randomly
generated data case is cast into a column, it is placed into
a cell using uniform probability. Fig. 24 shows the incremental
values associated with the bivariate plot in Fig. 21b.
In our continuing example we have n(.,.) = 99, so we would
generate 99 random numbers in the unit interval. All random
numbers $\le$ .071 would be placed in column i = 1; those greater
than .071 and $\le$ .192 would be placed in column i = 2; and
so on. As each data case is placed into a column, a single
random number is generated to determine into which cell the
case is to be placed (e.g., a random number $\le$ .33 would be
counted in cell (i,1); a random number greater than .33 and
$\le$ .66 would be counted in cell (i,2); etc.). After all 99
cases have been cast into their appropriate cells, all of
the statistics previously discussed would be computed and
saved. This process would be repeated 100 times so that we
would have an array containing 100 randomly generated potential
predictabilities, $a_0$'s, $a_1$'s and FD's. These would be
sorted from lowest to highest and the 96th value (PP(96),
$a_0$(96), $a_1$(96) and FD(96)) would determine the upper 5%
critical value and the 5th value (PP(05), $a_0$(05), $a_1$(05)
and FD(05)) would determine the lower 5% critical value. For
all statistics other than FD, we want values from our dependent
data set to be greater than the upper 5% or less than the lower
5% critical values. For FD we want values lower than the
upper 5% critical value to ensure that our second, and
subsequent, predictor is not significantly dependent on the
previous predictor(s).
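
The Monte Carlo procedure just described can be sketched as follows. This is an illustrative outline only: the function names, the use of Python's `random` module, and the `statistic` callback (standing in for PP, $a_0$, $a_1$ or FD) are assumptions, not part of the original program.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def simulate_counts(col_probs, n_rows, n_cases, rng):
    """Cast n_cases random cases into columns by marginal probability,
    then into a row of that column with uniform probability."""
    cumulative = list(accumulate(col_probs))      # incremental values 0..1
    counts = [[0] * n_rows for _ in col_probs]
    for _ in range(n_cases):
        i = min(bisect_left(cumulative, rng.random()), len(cumulative) - 1)
        j = int(rng.random() * n_rows)
        counts[i][j] += 1
    return counts


def critical_values(statistic, col_probs, n_rows, n_cases, trials=100, seed=0):
    """Return (lower, upper) 5% critical values: the 5th and 96th of the
    sorted statistics from `trials` simulated data sets (trials >= 96)."""
    rng = random.Random(seed)
    values = sorted(
        statistic(simulate_counts(col_probs, n_rows, n_cases, rng))
        for _ in range(trials)
    )
    return values[4], values[95]


# Example statistic (illustrative only): fraction of cases in column 1.
sample = simulate_counts([0.2] * 5, 3, 99, random.Random(1))
lower, upper = critical_values(lambda c: sum(c[0]) / 99.0, [0.2] * 5, 3, 99)
```

Fixing the seed makes the critical values repeatable between runs, which is convenient when comparing candidate predictors against the same simulated distribution.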



VI. CHOOSING PREDICTORS

The first predictor is determined as shown in Section III.
That is, by computing initial PP, $a_0$ and $a_1$ values for each
predictor, ordering them by skill score and PP and choosing
the one with the greatest skill score, or greatest PP in the
event that all skill scores are identical.

Subsequent predictors will be subjected to two tests:
functional dependence and skill score. Let

p = the number of predictors already chosen,

$a_0(k-1)$ and $a_1(k-1)$ = the 0- and 1-class errors
     of the previous stage of construction of the
     developmental model,

k = the index of the current stage.

Then, for the next (kth) predictor to be accepted it should
meet the following three conditions:

(1)  FD < FD(96|i)   (i = 1,...,p)

(2)  $a_0(k) > a_0(k-1)$  and  $a_1(k) \le a_1(k-1)$

(3)  $a_0(k) > a_0(96)$  and  $a_1(k) \le a_1(05)$

If condition (1) is not met but conditions (2) and (3) are,
then a predictor may still be used, but the increase of
predictability of the predictand will, on average, be less
than if condition (1) had been met. However, if conditions
(2) and (3) are not met, then the predictor should not be
considered further. Repeat this process at all stages for
all remaining predictors until no further predictors are
available, then stop the construction of the developmental
model.
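
The three acceptance conditions can be collected into a single decision routine. The sketch below is hypothetical: the function name, argument names and return strings are illustrative, and it assumes that failing either condition (2) or condition (3) is enough to reject a candidate, which is one reading of the text.

```python
def assess_predictor(fd_pairs, a0_k, a1_k, a0_prev, a1_prev, a0_crit96, a1_crit05):
    """Classify a candidate predictor at stage k.

    fd_pairs: list of (FD, FD96) for the candidate against each of the
              p predictors already chosen.
    """
    independent = all(fd < fd96 for fd, fd96 in fd_pairs)      # condition (1)
    improves = a0_k > a0_prev and a1_k <= a1_prev              # condition (2)
    significant = a0_k > a0_crit96 and a1_k <= a1_crit05       # condition (3)

    if not (improves and significant):
        return "reject"
    return "accept" if independent else "accept with reduced gain"


# Made-up numbers, for illustration only.
decision = assess_predictor(
    fd_pairs=[(0.12, 0.14)],
    a0_k=0.71, a1_k=0.13,
    a0_prev=0.70, a1_prev=0.13,
    a0_crit96=0.66, a1_crit05=0.14,
)
```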






VII. TESTING THE DEVELOPMENTAL MODEL ON INDEPENDENT DATA 



Once the model has been developed and no further predictors
remain to be considered, we can test it for skills
($a_0$, $a_1$) on an independent data set (any set whose numbers
were not used to develop the model). This is easily accomplished
by sorting the independent data case values into
predictor intervals, determined from the dependent data, and
calculating the location in the forecast array (ll in Figs.
21b and 22b) of the appropriate prediction, using the equations
established in Section III. It is to be expected that
on average the test ($a_0$, $a_1$) points on the skill diagram, for
an independent data set, will not be as skillful as on the
set of developmental points.
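
Sorting independent cases into intervals fixed by the dependent data can be sketched as below. The names `interval_edges` and `interval_index` are hypothetical, and the quantile rule used to form equally populous intervals is an assumption, since the definition in Section II.A is not reproduced here.

```python
from bisect import bisect_left


def interval_edges(dependent_values, m):
    """Upper edges of the first m-1 equally populous intervals,
    taken from the sorted dependent data (requires len >= m)."""
    ordered = sorted(dependent_values)
    n = len(ordered)
    return [ordered[(k * n) // m - 1] for k in range(1, m)]


def interval_index(x, edges):
    """1-based interval number for a value x; values beyond the
    dependent-data range fall in the first or last interval."""
    return bisect_left(edges, x) + 1


edges = interval_edges(range(1, 11), 5)          # dependent data 1..10
indices = [interval_index(x, edges) for x in (-5, 2, 3, 10, 100)]
```

Each independent case is then mapped to a column of the forecast array by feeding its interval numbers to the column-location equation of Section III.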
APPENDIX B



LINEAR REGRESSION AND THRESHOLD MODELS 
A. LINEAR REGRESSION 

In this study a least-squares, multiple linear regression
model, known as BMDP2R in the BMDP Statistical Software
[University of California, 1981], was used. The procedure
used is called forward step-wise selection and picks the
predictors (of the many offered) that have the highest
correlation with the predictand (visibility) based upon
F-to-enter and F-to-remove limits, where F is a ratio which
tests the significance of the coefficients of the predictors
in the regression equation.

The regression model fitted to the data is

$$y = a + b_1 x_1 + b_2 x_2 + \dots + b_p x_p + \epsilon$$

where:

y = the dependent variable (predictand), which can
    be either a continuous function or a discrete
    value

$x_1, \dots, x_p$ = the independent variables (predictors)

$b_1, \dots, b_p$ = the regression coefficients

a = the intercept

p = the number of independent variables

$\epsilon$ = the error with mean zero.

The predicted value $\hat{y}$, and the general form of the resulting
equation, is

$$\hat{y} = a + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

The step-wise selection of predictors continues until there
are no predictors remaining which meet the F-to-enter criteria.
The regression equation generated at each step is printed,
along with its R-value (the correlation of the dependent
variable y with the predicted value $\hat{y}$) and $R^2$. The resulting
set of equations, one for each step, is reviewed, and that
equation containing only those predictors which increased
$R^2$ by at least .01 is retained for application.
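
The retention rule for the step-wise equations can be expressed in a few lines. This is a sketch under an assumption: it reads the rule as "keep steps in order until the first step whose gain in $R^2$ falls below .01"; the function name `retained_step` and the sample values are hypothetical.

```python
def retained_step(r_squared_by_step, min_gain=0.01):
    """Number of step-wise steps retained: steps are kept in order until
    one increases R^2 by less than min_gain."""
    kept = 0
    previous = 0.0
    for r2 in r_squared_by_step:
        if r2 - previous < min_gain:
            break
        kept += 1
        previous = r2
    return kept


# Illustrative R^2 sequence (not from the study): the third step adds only .005.
steps_kept = retained_step([0.20, 0.26, 0.265, 0.30])
```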

The role of regression, once appropriate predictor
variables have been selected, is simply that of dimension
reduction (representing a multivariate structure by a
univariate proxy which constitutes a classificatory or predictive
index). This proxy takes the form of a polynomial, linear
in its coefficients, of the components of the multivariate
structure. The problem now becomes one of determining the
form of the state conditional distributions (one for each
group of interest; e.g., 1, 2 and 3 for visibility categories
I, II and III, as used in this study). Once an appropriate
form has been selected, it remains, then, to determine the
parameters of the class conditional distributions (e.g.,
means and variances) and then apply the decision criteria or
threshold model.
B. THRESHOLDS [LOWE, 1984a]



1. Notation

E = an event; this is an indicator variable such that
    when E = 1, the threatening event occurs, and
    when E = 0, the non-threatening event occurs.

C = the classification of an unknown event such that
    when C = 1, the event is classified as a
    threat, and when C = 0, the event is classified
    as a non-threat.

$P[E=1]$ = unconditional probability of occurrence of threat.

$P[E=0]$ = unconditional probability of occurrence of non-threat.

Error of the 1st kind (false alarm): $[C=1 \cap E=0]$.
Error of the 2nd kind (miss): $[C=0 \cap E=1]$.

$P[C=1 \cap E=0]$ = joint probability of an error of the 1st kind.

$P[C=0 \cap E=1]$ = joint probability of an error of the 2nd kind.

$P[C=1 \mid E=0]$ = class conditional probability of
    misclassifying a non-threat.

$P[C=0 \mid E=1]$ = class conditional probability of
    misclassifying a threat.

$P[C=1 \cap E=0] = P[C=1 \mid E=0]\,P[E=0]$.

$P[C=0 \cap E=1] = P[C=0 \mid E=1]\,P[E=1]$.

z = a value of the predictive index (equivalent
    to $\hat{y}$, above).

Z = range of the predictive index on the real line.

For a dichotomous problem, Z is partitioned into two parts,
$Z_0$ and $Z_1$:

C = 0 if $z \in Z_0$

C = 1 if $z \in Z_1$

The decision regions are mutually exclusive and exhaustive
(i.e., $Z_0 \cap Z_1 = \emptyset$ and $Z = Z_0 \cup Z_1$).

Thresholds $\equiv$ boundary(s) between decision regions.

$p(z \mid E=0)$ = class conditional density of z given
    that E = 0.

$p(z \mid E=1)$ = class conditional density of z given
    that E = 1.

$\Lambda(z) = p(z \mid E=1)/p(z \mid E=0)$ = the likelihood
    ratio (i.e., the ratio of class conditional
    densities).

$p_e = p\{[C=1 \cap E=0] \cup [C=0 \cap E=1]\}$ = the total
    probability of error.

2. Minimum Probability of Error Criterion

$p_e$ = probability of an incorrect classification.

$$p_e = p[C=1 \mid E=0]\,p[E=0] + p[C=0 \mid E=1]\,p[E=1]$$

where $p[E=1] + p[E=0] = 1$. Note that the events E = 1
and E = 0 are mutually exclusive and exhaustive. The objective
is to select decision regions (thresholds) so as to
minimize $p_e$.

$$p[C=0 \mid E=1] = \int_{z \in Z_0} p(z \mid E=1)\,dz$$

     = the probability of misclassifying E = 1.

Because $Z_0$ and $Z_1$ together make up Z, over which the
density integrates to one,

$$p[C=0 \mid E=1] = \int_{z \in Z_0} p(z \mid E=1)\,dz + \int_{z \in Z_1} p(z \mid E=1)\,dz - \int_{z \in Z_1} p(z \mid E=1)\,dz$$

$$p[C=0 \mid E=1] = 1 - \int_{z \in Z_1} p(z \mid E=1)\,dz$$

$$p[C=1 \mid E=0] = \int_{z \in Z_1} p(z \mid E=0)\,dz$$

These are substituted into the expression for $p_e$; then,

$$p_e = p[E=0] \int_{z \in Z_1} p(z \mid E=0)\,dz + p[E=1]\left[1 - \int_{z \in Z_1} p(z \mid E=1)\,dz\right]$$

and algebraic rearrangement yields

$$p_e = p[E=1] + \int_{z \in Z_1} \{p[E=0]\,p(z \mid E=0) - p[E=1]\,p(z \mid E=1)\}\,dz$$

In order to minimize $p_e$, $Z_1$ (the decision region for C = 1)
will include all those values of z for which the integrand
in the expression for $p_e$ will be negative. The decision
regions can be symbolically represented as follows:

$$Z_0 = \{z: p[E=0]\,p(z \mid E=0) - p[E=1]\,p(z \mid E=1) > 0\}$$

$$Z_1 = \{z: p[E=0]\,p(z \mid E=0) - p[E=1]\,p(z \mid E=1) < 0\}$$

An alternative representation is given by

$$Z_0 = \{z: p[E=0]\,p(z \mid E=0) > p[E=1]\,p(z \mid E=1)\}$$

$$\phantom{Z_0} = \{z: p[E=0]/p[E=1] > p(z \mid E=1)/p(z \mid E=0)\}$$

Likewise,

$$Z_1 = \{z: p[E=0]/p[E=1] < p(z \mid E=1)/p(z \mid E=0)\}$$

These statements can be combined to give

$$\Lambda(z) = \frac{p(z \mid E=1)}{p(z \mid E=0)} \;\underset{C=0}{\overset{C=1}{\gtrless}}\; \frac{p[E=0]}{p[E=1]}$$

Thresholds are the value(s) of z for which

$$\Lambda(z) = p[E=0]/p[E=1]$$

This equation can be solved for z either analytically or
numerically depending on the forms of the density functions.
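
The decision rule above can be applied directly, without solving for thresholds, by comparing the prior-weighted densities at a given z. The sketch below is illustrative: the function names and the Gaussian example densities are assumptions, and the comparison $p_1\,p(z|E=1) > p_0\,p(z|E=0)$ is used because it is equivalent to the likelihood-ratio test while avoiding a division.

```python
import math


def gaussian_density(z, mean, sd):
    """Normal density with the given mean and standard deviation."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def classify(z, density1, density0, p1, p0):
    """Minimum-probability-of-error rule: C = 1 when the likelihood
    ratio density1(z)/density0(z) exceeds p0/p1, otherwise C = 0."""
    return 1 if p1 * density1(z) > p0 * density0(z) else 0


def threat(z):
    return gaussian_density(z, 2.0, 1.0)      # assumed density for E = 1


def non_threat(z):
    return gaussian_density(z, 0.0, 1.0)      # assumed density for E = 0


equal_priors = [classify(z, threat, non_threat, 0.5, 0.5) for z in (0.5, 1.5)]
rare_threat = [classify(z, threat, non_threat, 0.1, 0.9) for z in (1.5, 2.5)]
```

With equal priors the boundary sits midway between the means; when the threat is rare the boundary shifts toward the threat mean, so z = 1.5 is no longer classified as a threat.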
3. Threshold Cases 

In order to exemplify the model, the assumption is
made that the class conditional distributions are Gaussian.
There are essentially three distinct cases that can arise.

a. Case I: Equal variances; different means
   (Referred to as the equal variance model in the
   text)

$$p(z \mid E=1) = k \exp\{(-1/2)(z - \mu_1)^2/\sigma^2\}$$

$$p(z \mid E=0) = k \exp\{(-1/2)(z - \mu_0)^2/\sigma^2\}$$

where:

$$k = (2\pi)^{-1/2}\,\sigma^{-1}$$

$$\Lambda(z) = \frac{\exp\{(-1/2)(z - \mu_1)^2/\sigma^2\}}{\exp\{(-1/2)(z - \mu_0)^2/\sigma^2\}} \;\underset{C=0}{\overset{C=1}{\gtrless}}\; \frac{p_0}{p_1}$$

where $p_0 = p[E=0]$ and $p_1 = p[E=1]$. Taking logarithms and
solving for z, the threshold value is

$$z^* = (\mu_0 + \mu_1)/2 + \sigma^2 \ln(p_0/p_1)/(\mu_1 - \mu_0)$$

The position of the threshold depends on the relative values
of $p_1$ and $p_0$. The threshold moves toward the group with the
smallest p. If $p_1 = p_0$ the threshold will be the value of
z where the densities intersect (i.e., where the densities
are equal).
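
The Case I threshold is simple enough to evaluate directly. The following is a minimal sketch assuming the Gaussian equal-variance form derived from $\ln\Lambda(z) = \ln(p_0/p_1)$; the function name and the example numbers are hypothetical.

```python
import math


def equal_variance_threshold(mu0, mu1, sigma, p0, p1):
    """Threshold z* for two Gaussian classes with common standard
    deviation sigma, means mu0 != mu1 and prior probabilities p0, p1."""
    return 0.5 * (mu0 + mu1) + sigma ** 2 * math.log(p0 / p1) / (mu1 - mu0)


midpoint = equal_variance_threshold(0.0, 2.0, 1.0, 0.5, 0.5)
shifted = equal_variance_threshold(0.0, 2.0, 1.0, 0.9, 0.1)
```

With equal priors the threshold is the midpoint of the means; with $p_1 < p_0$ it moves toward $\mu_1$, the mean of the group with the smaller prior, as stated in the text.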

b. Case II: Equal means; different variances 

With a common mean $\mu = \mu_0 = \mu_1$,

$$\Lambda(z) = \frac{\sigma_0 \exp\{(-1/2)(z - \mu)^2/\sigma_1^2\}}{\sigma_1 \exp\{(-1/2)(z - \mu)^2/\sigma_0^2\}} \;\underset{C=0}{\overset{C=1}{\gtrless}}\; \frac{p_0}{p_1}$$

with the threshold

$$z^* = \mu \pm \left[\frac{2\sigma_0^2\sigma_1^2}{\sigma_1^2 - \sigma_0^2} \ln\!\left(\frac{p_0\sigma_1}{p_1\sigma_0}\right)\right]^{1/2}$$

Note that in this situation there are two thresholds. The
group having the smaller variance will lie between the two
thresholds.

The thresholds shown are typical of a situation where $p_1 < p_0$.
Note that these thresholds lie between the two intersections
of the densities. If the inequality of prior probabilities
were reversed, the thresholds would lie outside of the
region between the two density intersections. Further note
that the decision region for the group having the lesser
variance lies between the thresholds.

[Figure: class conditional densities plotted against the
classification index (z).]
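c. Case III: General Solution (Referred to as
   the Quadratic Model in the text)

$$p(z \mid E=1) = (k/\sigma_1) \exp\{(-1/2)(z - \mu_1)^2/\sigma_1^2\}$$

$$p(z \mid E=0) = (k/\sigma_0) \exp\{(-1/2)(z - \mu_0)^2/\sigma_0^2\}$$

where $k = (2\pi)^{-1/2}$.

$$\Lambda(z) = \exp\left\{\frac{1}{2}\left[\left(\frac{z - \mu_0}{\sigma_0}\right)^2 - \left(\frac{z - \mu_1}{\sigma_1}\right)^2\right]\right\} \;\underset{C=0}{\overset{C=1}{\gtrless}}\; \frac{p_0\sigma_1}{p_1\sigma_0}$$

Algebraic manipulation produces

$$(\sigma_1^2 - \sigma_0^2)z^2 + 2(\sigma_0^2\mu_1 - \sigma_1^2\mu_0)z + [(\sigma_1^2\mu_0^2 - \sigma_0^2\mu_1^2) - 2\sigma_0^2\sigma_1^2 \ln(p_0\sigma_1/p_1\sigma_0)] \;\underset{C=0}{\overset{C=1}{\gtrless}}\; 0$$

which is recognizable as a quadratic equation in z.

$$z^* = \frac{-b \pm (b^2 - 4ac)^{1/2}}{2a}$$

where:

$$a = \sigma_1^2 - \sigma_0^2$$

$$b = 2(\sigma_0^2\mu_1 - \sigma_1^2\mu_0)$$

$$c = (\sigma_1^2\mu_0^2 - \sigma_0^2\mu_1^2) - 2\sigma_0^2\sigma_1^2 \ln(p_0\sigma_1/p_1\sigma_0)$$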
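
The quadratic coefficients above translate directly into a small routine. This is a sketch under the Gaussian assumption of this section; the function name `gaussian_thresholds` is hypothetical, and the fallback to a single linear solution when the variances are equal (a = 0) is an added convenience that reproduces Case I.

```python
import math


def gaussian_thresholds(mu0, sigma0, p0, mu1, sigma1, p1):
    """Threshold(s) z* for two Gaussian classes, returned in ascending
    order; an empty list means one class is chosen for every z."""
    v0, v1 = sigma0 ** 2, sigma1 ** 2
    log_term = math.log((p0 * sigma1) / (p1 * sigma0))
    a = v1 - v0
    b = 2.0 * (v0 * mu1 - v1 * mu0)
    c = (v1 * mu0 ** 2 - v0 * mu1 ** 2) - 2.0 * v0 * v1 * log_term
    if a == 0.0:                       # equal variances: linear equation
        return [] if b == 0.0 else [-c / b]
    discriminant = b * b - 4.0 * a * c
    if discriminant < 0.0:
        return []
    root = math.sqrt(discriminant)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


# Equal means, different variances (Case II), equal priors.
pair = gaussian_thresholds(0.0, 1.0, 0.5, 0.0, 2.0, 0.5)
# Equal variances (Case I), equal priors.
single = gaussian_thresholds(0.0, 1.0, 0.5, 2.0, 1.0, 0.5)
```

In the equal-means example the two thresholds are symmetric about the common mean, and the class with the smaller variance occupies the interval between them.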
The remarks given for the figures in cases I and II are also
applicable here. More often than not, only one of a pair of
thresholds induced by differing variances will be of real
interest. If the variances of the two groups are radically
different, then both members of the threshold pair become
important.

In the foregoing, normal class conditional distributions
were assumed. This was done because the Gaussian
form admits of a rather clean analytical solution. However,
the general concept of the minimum probable error decision
criteria may be applied to any form of density function.
Indeed, the density function of one group need not even be
the same form as that for another group (one might be
exponential and the other Gaussian). The difficulty with most
non-Gaussian forms is that they seldom admit of closed analytical
forms and require numerical means in determination of thresholds.
APPENDIX C

NORTHERN HEMISPHERE PREDICTOR PARAMETERS AVAILABLE
FOR THE NORTH PACIFIC OCEAN, JULY 1979, EXPERIMENTS

Area: 30°-60°N; 145°E-130°W

Model output time: 0000GMT (TAU00)

A. Model output
   parameters        Descriptive name of parameters

   Primitive equation model

   TX                Surface air temperature
   EX                Surface vapor pressure
   EHF               Evaporative heat flux
   SEHF              Sensible plus evaporative heat flux
   THF               Total heat flux
   H510              1000-500 mb thickness anomaly
   GGTHTA            Surface-front location parameter
   FTER              Advective fog probability

   Mass structure model

   PS                Surface pressure
   TAIR              Surface air temperature
   EAIR              Surface vapor pressure
   TSEA              Sea surface temperature
   SSANOM            Sea surface temperature anomaly
   T925              925 mb temperature
   U925              925 mb zonal wind component
   V925              925 mb meridional wind component
   NCLOUD            Total cloud cover

   Marine wind model

   WWW               Marine surface wind speed
   DDWW              Marine surface wind direction

B. Climatological parameter

   CLIMO             National Climatic Center fog
                     frequency climatology

C. Derived parameters

   ASTD              TAIR-TSEA
   RH                Surface relative humidity



APPENDIX D 



NOGAPS PREDICTOR PARAMETERS AVAILABLE FOR THE NORTH
ATLANTIC OCEAN, 15 MAY-15 JULY 1983, EXPERIMENTS

Area: Entire North Atlantic Ocean and Mediterranean Sea

Model output time: 1200GMT (TAU00)

A. Model output
   parameter         Descriptive name of parameter

   D1000             1000 mb geopotential height
   D925              925 mb geopotential height
   D850              850 mb geopotential height
   D700              700 mb geopotential height
   D500              500 mb geopotential height
   D400 *            400 mb geopotential height
   D300 *            300 mb geopotential height
   D250 *            250 mb geopotential height
   TAIR              Surface air temperature
   T1000             1000 mb temperature
   T925              925 mb temperature
   T700              700 mb temperature
   T500              500 mb temperature
   T400 *            400 mb temperature
   T300 *            300 mb temperature
   T250 *            250 mb temperature
   EAIR              Surface vapor pressure
   E1000             1000 mb vapor pressure
   E925              925 mb vapor pressure
   E850              850 mb vapor pressure
   E700              700 mb vapor pressure
   E500              500 mb vapor pressure
   UBLW              Boundary layer zonal wind component
   U1000             1000 mb zonal wind component
   U925              925 mb zonal wind component
   U850              850 mb zonal wind component
   U700              700 mb zonal wind component
   U500              500 mb zonal wind component
   U400 *            400 mb zonal wind component
   U300 *            300 mb zonal wind component
   U250 *            250 mb zonal wind component
   VBLW              Boundary layer meridional wind component
   V1000             1000 mb meridional wind component
   V925              925 mb meridional wind component
   V850              850 mb meridional wind component
   V700              700 mb meridional wind component
   V500              500 mb meridional wind component
   V400 *            400 mb meridional wind component
   V300 *            300 mb meridional wind component
   V250 *            250 mb meridional wind component
   VOR925 **         925 mb vorticity
   VOR500 **         500 mb vorticity
   PS                Surface pressure
   SMF               Surface moisture flux
   PBLD              Planetary boundary-layer depth
   STRTFQ            Percent stratus frequency
   STRTTH            Stratus thickness
   SHF               Surface heat flux
   ENTRN             Entrainment at top of marine
                     boundary-layer
   DRAG **           Drag coefficient ($C_D$)

B. Derived parameters

   DTDP              Vertical gradient of temperature
   DEDP              Vertical gradient of vapor pressure
   DUDP              Vertical gradient of zonal wind
   DVDP              Vertical gradient of meridional wind
   RH                Surface relative humidity

   BM1 ***           2.81132 + (.16201 x EAIR)
                     - (.00237 x E850) - (.0739 x T925)
                     - (.16179 x E925)

   BM2 ***           2.08302 + (.36810 x TAIR)
                     - (.26675 x T1000) - (.15980 x T925)

   BM3 ***           3.00866 + (.11771 x EAIR)
                     - (.01024 x E850) - (.19321 x E925)

   BM4 ***           2.42235 - (.000418 x UBLW)
                     + (.000255 x U700)

   BM5 ***           2.57317 + (.000893 x D1000)
                     - (.0000489 x D700)

   BM6 ***           2.55859 - (.000355 x V1000)

   BM7 ***           -15.2173 + (.01764 x PS)
                     - (.01007 x STRTFQ) + (.02642 x STRTTH)
                     + (.06042 x SHF)

*   Parameters which were not used due to their being
    considered as having little likelihood of being
    important in forecasting marine visibility.

**  Parameters which were not used due to loss of
    significant digits during transfer from tape
    to mass storage.

*** Linear regression equation parameters.
APPENDIX E

SKILL AND THREAT SCORES

Contingency table layout:

                      OBSERVED
                   1      2      3

FORECAST   3       R      S      T
           2       U      V      W
           1       X      Y      Z

Total = R + S + T + U + V + W + X + Y + Z

P1 = (R+U+X)/Total
P2 = (S+V+Y)/Total
P3 = (T+W+Z)/Total
PN = greatest of P1, P2 or P3

Raw scores

A0   = % correct = (X+V+T)/Total

A1   = 1-class error = (U+S+Y+W)/Total

TS1  = Threat score for visibility category I
     = X/(R+U+X+Y+Z)

TS2  = Threat score for visibility category II
     = V/(S+U+V+W+Y)

TS12 = Threat score for visibility categories I and II
     = (X+V)/(Total-T)

TS12 is designed to represent the skill of forecasting
visibility categories I and II as separate categories, rather
than their skill as a combined category, which would be
(U+V+X+Y)/(Total-T).

Adjusted scores

AA0   = (A0-PN)/(1-PN)
ATS1  = (TS1-P1)/(1-P1)
ATS2  = (TS2-P2)/(1-P2)
ATS12 = (TS12-[P1+P2])/(1-[P1+P2])
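
The score definitions of this appendix can be checked against a tabulated case. The sketch below is illustrative (the function name and dictionary keys are hypothetical); it assumes the table is supplied with forecast categories 3, 2, 1 as rows from top to bottom and observed categories 1, 2, 3 as columns, and the sample counts are those listed for the MAXPROB1 dependent data in Table VI.

```python
def contingency_scores(table):
    """Raw and adjusted skill and threat scores from a 3 x 3 table.

    Rows: forecast categories 3, 2, 1. Columns: observed 1, 2, 3.
    """
    (R, S, T), (U, V, W), (X, Y, Z) = table
    total = R + S + T + U + V + W + X + Y + Z
    p1 = (R + U + X) / total
    p2 = (S + V + Y) / total
    p3 = (T + W + Z) / total
    pn = max(p1, p2, p3)

    a0 = (X + V + T) / total                 # fraction correct
    a1 = (U + S + Y + W) / total             # 1-class error
    ts1 = X / (R + U + X + Y + Z)
    ts2 = V / (S + U + V + W + Y)
    ts12 = (X + V) / (total - T)
    return {
        "A0": a0, "A1": a1, "TS1": ts1, "TS2": ts2, "TS12": ts12,
        "AA0": (a0 - pn) / (1 - pn),
        "ATS1": (ts1 - p1) / (1 - p1),
        "ATS2": (ts2 - p2) / (1 - p2),
        "ATS12": (ts12 - (p1 + p2)) / (1 - (p1 + p2)),
    }


# Counts as listed for MAXPROB1, dependent data, in Table VI.
scores = contingency_scores([[316, 301, 2198], [29, 79, 29], [471, 118, 141]])
rounded = {name: round(value, 2) for name, value in scores.items()}
```

Rounded to two places, these reproduce the statistics printed with that table (A0 = .75, TS1 = .44, TS2 = .14, TS12 = .37).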



APPENDIX F 



TABLES 



TABLE I. A SUMMARY OF THE OBSERVATIONS (PERCENTAGE
         FREQUENCIES) OF THREE VISIBILITY CATEGORIES
         (VISCAT'S), FOR THE NORTH ATLANTIC OCEAN
         HOMOGENEOUS AREAS SHOWN IN FIG. 1, 15 MAY-
         15 JULY 1983

         NUMBER OF
AREA     OBSERVATIONS   VISCAT I        VISCAT II       VISCAT III

1          2725          163 (.06)       436 (.16)       2126 (.78)
2          2867          277 (.10)       317 (.11)       2273 (.79)
3E          131            8 (.06)        31 (.24)         92 (.70)
3W         2288          437 (.19)       284 (.12)       1567 (.68)
4          4771          129 (.03)       597 (.13)       4045 (.85)
5E         1087            9 (.01)        94 (.09)        984 (.91)
5W         2307            8 (.003)       40 (.02)       2259 (.98)
6N          580           19 (.03)        45 (.08)        516 (.89)
6M         2337           21 (.01)       131 (.06)       2185 (.93)
6S           60            1 (.02)         2 (.03)         57 (.95)
7           801            7 (.01)        34 (.04)        760 (.95)
8          1284            1 (.001)       27 (.02)       1256 (.98)

ENTIRE NORTH ATLANTIC AND MEDITERRANEAN

         21,238         1080 (.05)      2038 (.10)     18,120 (.85)
TABLE II. NUMBER OF OBSERVATIONS (PERCENTAGE FREQUENCIES)
          OF THREE VISIBILITY CATEGORIES (VISCAT'S),
          AND 95% CONFIDENCE INTERVALS FOR THE
          DEPENDENT AND INDEPENDENT DATA, FOR THE NORTH
          PACIFIC OCEAN AND AREA 3W OF THE NORTH
          ATLANTIC OCEAN

North Pacific Ocean, July 1979

                                                                TOTAL # OF
                   VISCAT I       VISCAT II      VISCAT III     OBSERVATIONS

95% CI             .207-.229      .126-.144      .635-.660
Dependent data      816 (.222)     498 (.135)    2368 (.643)     3682
Independent data    388 (.211)     246 (.134)    1207 (.656)     1841
Total              1204 (.218)     744 (.135)    3575 (.647)     5523

North Atlantic Ocean area 3W, MAY-JUN 1983

95% CI             .175-.207      .111-.138      .666-.704
Dependent data      296 (.194)     190 (.125)    1040 (.682)     1526
Independent data    141 (.185)      94 (.123)     527 (.692)      762
Total               437 (.191)     284 (.124)    1567 (.685)     2288
TABLE III. THE INITIAL FIVE BEST PREDICTORS FOR
           EPI'S OF FOUR THROUGH TEN, FOR EACH
           STRATEGY, WITH ASSOCIATED PP, a_0, a_1
           AND CE VALUES FROM THE NORTH PACIFIC
           OCEAN DEPENDENT DATA, JULY 1979

                          Maximum-probability     Natural-regression

EPI  Predictor   PP       a_0    a_1    CE        a_0    a_1    CE

4    EHF        .328     .684   .135   .497      .491   .467   .551
     SEHF       .315     .681   .135   .503      .478   .475   .569
     FTER       .317     .680   .135   .505      .482   .468   .568
     CLIMO      .296     .657   .135   .551      .471   .478   .580
     RH         .311     .649   .135   .567      .508   .442   .542

5    EHF        .337     .697   .135   .471      .435   .538   .592
     SEHF       .319     .688   .135   .489      .535   .400   .530
     FTER       .314     .678   .135   .509      .539   .396   .526
     RH         .312     .658   .135   .549      .449   .518   .584
     CLIMO      .295     .658   .135   .549      .418   .549   .615

6    EHF        .338     .695   .135   .475      .491   .467   .551
     SEHF       .319     .690   .135   .485      .478   .475   .569
     FTER       .318     .673   .135   .519      .574   .349   .503
     RH         .316     .661   .135   .543      .508   .442   .542
     CLIMO      .295     .659   .135   .547      .471   .478   .580

7    EHF        .337     .693   .135   .479      .529   .415   .527
     SEHF       .319     .685   .135   .495      .523   .417   .537
     FTER       .320     .675   .135   .515      .523   .417   .537
     CLIMO      .297     .661   .135   .543      .435   .528   .602
     RH         .314     .659   .135   .547      .308   .654   .730

8    EHF        .338     .688   .135   .489      .491   .467   .551
     SEHF       .320     .681   .135   .503      .478   .475   .569
     FTER       .320     .680   .135   .505      .553   .377   .517
     CLIMO      .301     .663   .135   .539      .404   .567   .625
     RH         .315     .657   .135   .551      .508   .441   .543

TABLE III (CONT.)

9    EHF        .340     .693   .135   .479      .522   .425   .531
     SEHF       .322     .686   .135   .493      .514   .429   .543
     FTER       .324     .683   .135   .499      .574   .349   .503
     CLIMO      .299     .663   .135   .539      .443   .516   .598
     RH         .315     .657   .135   .551      .476   .482   .566

10   EHF        .341     .696   .135   .473      .491   .467   .551
     SEHF       .323     .688   .135   .489      .534   .401   .531
     FTER       .322     .678   .135   .509      .539   .396   .526
     CLIMO      .300     .662   .135   .541      .418   .549   .615
     RH         .316     .658   .135   .549      .508   .441   .543
TABLE IV. FIRST-STAGE CONTINGENCY TABLE STATISTICS
          A0, TS1, AA0 AND ATS1 FOR BOTH DEPENDENT
          AND INDEPENDENT NORTH PACIFIC OCEAN, JULY
          1979, DATA, FOR EPI'S OF FOUR THROUGH TEN
          AND THE MAXIMUM-PROBABILITY STRATEGY, WITH
          EHF AS THE FIRST PREDICTOR FOR EACH NUMBER
          OF EPI'S

           Dependent data                 Independent data

EPI    A0     TS1    AA0    ATS1      A0     TS1    AA0    ATS1

4     .684   .36    .113   .17       .686   .34    .087   .16
5     .697   .35    .150   .17       .695   .33    .114   .15
6     .695   .32    .145   .13       .696   .30    .117   .12
7     .693   .30    .139   .10       .693   .28    .107   .09
8     .688   .27    .126   .06       .694   .27    .110   .08
9     .693   .36    .139   .17       .695   .34    .114   .16
10    .696   .35    .149   .17       .695   .33    .114   .15
TABLE V. FD(96), FD, RSS FD AND a_0 FOR STRATEGY
         MAXPROB2, NORTH PACIFIC OCEAN, JULY 1979,
         DEPENDENT DATA, FOR THOSE PREDICTORS
         SELECTED AT EACH STAGE OF THE DEVELOPMENTAL
         MODEL USING FIVE EPI'S. FD(96) IS COMPUTED
         FROM 100 RANDOMLY GENERATED DATA SETS,
         AS EXPLAINED IN APPENDIX A, AND PROVIDES
         A MEASURE OF HOW MUCH ADDITIONAL PREDICTABILITY
         MAY BE EXPECTED FROM THE INCLUSION
         OF A NEW PREDICTOR. IDEALLY, RSS FD
         SHOULD BE LESS THAN FD(96)

                      FD, of predictor added, on
Predictor
added      FD(96)     EHF      DDWW     H510     RH       RSS FD    a_0

EHF          -         -        -        -        -         -       .697
DDWW       .1399     .1494      -        -        -       .1494     .699
H510       .1978     .2488    .2185      -        -       .3311     .704
RH         .2423     .2606    .2087    .1515      -       .3666     .746
THF        .2798     .3290    .1464    .1678    .1907     .4408     .820
CLIMO      .3128     .3558    .1727    .1823    .2551       *       .882

*RSS FD was not computed for CLIMO as the choice for the
sixth predictor was between only CLIMO and SEHF.
It was more economical to compute contingency table
statistics for each and to choose the best predictor
from those results.
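
The RSS FD column of Table V is consistent with a root-sum-square of the individual FD values in each row. The sketch below checks this reading on two rows; the function name is hypothetical, and the interpretation of "RSS" as root-sum-square is inferred from the tabulated numbers rather than stated in this appendix.

```python
import math


def rss_fd(fd_values):
    """Root-sum-square of the functional dependences of a candidate
    predictor on each predictor already chosen."""
    return math.sqrt(sum(fd * fd for fd in fd_values))


# FD values listed in Table V for the H510 and RH rows.
rss_h510 = round(rss_fd([0.2488, 0.2185]), 4)
rss_rh = round(rss_fd([0.2606, 0.2087, 0.1515]), 4)
```

Both rows reproduce the tabulated RSS FD to four places, and both exceed the corresponding FD(96), so neither meets the ideal stated in the table caption.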
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TABLE VI . CONTINGENCY TABLES AND RELATED STATISTICS FOR 
BOTH DEPENDENT (3682 OBSERVATIONS) AND 
INDEPENDENT (1841 OBSERVATIONS) NORTH PACIFIC 
OCEAN, JULY 1979, DATA, FROM STAGE FOUR OF 
THE DEVELOPMENTAL MODEL. PREDICTORS ARE EHF , 
DDWW, H510 AND RH , EACH DIVIDED INTO FIVE 
EPI'S, FOR (A) JHAXPROBl, (B) MAXPROB2 AND 
(C) NATURAL-REGRESSION 



(a) MAXPROBl 



DEPENDENT DATA 



cn 

< 

O 2 
UJ 

cr 

O 



316 


301 


2198 


29 


79 


29 


471 


118 


141 



1 2 3 

OBSERVED 



AO = 


. 75 


AAO = 


.29 


A1 = 


.13 






TS1 = 


.44 


ATS1 = 


.28 


TS2 = 


.14 


ATS2 = 


.01 


TS12 = 


. 37 


ATS12= 


.02 



INDEPENDENT DATA 



H 

Crt 

< 

o 

LU 

cr 

O 



175 


162 


1065 


24 


26 


35 


189 


58 


107 



1 2 3 
OBSERVED 



AO = 


.70 ■ 


AAO = 


.12 


A1 = 


.15 






TS1 = 


.34 


ATS1 = 


.17 


TS2= 


.09 


ATS2 = 


-.06 


TS12 = 


.28 


ATS12= 


1 

• 

o 



92 



1 

I 



I 



1 



TABLE VI (CONT.) 



(b) MAXPROB2 



DEPENDENT DATA 



3 

H- 

(f) 

< 

O 2 
LU 

q: 

O 

u. 

1 



228 


238 


2077 


25 


108 


63 


563 


152 


228 



1 2 3 

OBSERVED 



AO = 


.75 


AAO = 


.29 


A1 = 


.13 






TS1 = 


.47 


ATS1 = 


. 32 


TS2 = 


.18 


ATS2 = 


.06 


TS12 = 


.42 


ATS12= 


.10 



INDEPENDENT DATA 



(/> 

< 

o 

LU 

q: 

O 

LU 



135 


136 


1007 


23 


29 


48 


230 


81 


152 



1 2 3 

OBSERVED 



AO = 


.69 


AAO = 


.09 


A1 = 


.16 






TS1 = 


. 37 


ATS1 = 


.20 


TS2 = 


.09 


ATS2 = 


-.05 


TS12 = 


.31 


ATS12= 


-.05 
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TABLE VI (CONT.) 



(c) Natural-Regression

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      75    171   1773
           2     501    279    565
           1     240     48     30

AO   = .62    AAO   = -.06
A1   = .35
TS1  = .27    ATS1  = .06
TS2  = .18    ATS2  = .05
TS12 = .27    ATS12 = -.13

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      72     91    857
           2     226    128    298
           1      90     27     52

AO   = .58    AAO   = -.21
A1   = .35
TS1  = .19    ATS1  = -.02
TS2  = .17    ATS2  = .04
TS12 = .22    ATS12 = -.19
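
The statistics printed under each contingency table can be recomputed from the nine cell counts. The sketch below is a minimal illustration, not the thesis's own program. The formulas for AO (fraction correct), A1 (fraction of one-class errors), the threat scores TS1 and TS2, and TS12 are assumptions, inferred only from the fact that they reproduce the values printed for Table VI(a), dependent data. The function name and the row ordering of the input are hypothetical.

```python
def contingency_stats(table):
    """Summary scores for a 3x3 forecast/observed table.

    table[i][j] is the number of cases with forecast category i+1 and
    observed category j+1. All formulas are assumptions inferred from
    the tabulated values, not definitions quoted from the thesis.
    """
    n = sum(sum(row) for row in table)
    hits = [table[k][k] for k in range(3)]
    forecast = [sum(table[k]) for k in range(3)]                        # row totals
    observed = [sum(table[i][k] for i in range(3)) for k in range(3)]  # column totals

    ao = sum(hits) / n                                  # fraction correct
    a1 = sum(table[i][j] for i in range(3) for j in range(3)
             if abs(i - j) == 1) / n                    # off by exactly one class
    ts1 = hits[0] / (forecast[0] + observed[0] - hits[0])
    ts2 = hits[1] / (forecast[1] + observed[1] - hits[1])
    ts12 = (hits[0] + hits[1]) / (n - hits[2])          # categories 1 and 2 together
    return {"AO": ao, "A1": a1, "TS1": ts1, "TS2": ts2, "TS12": ts12}


# Table VI(a), dependent data, rows ordered forecast 1, 2, 3.
table_vi_a_dependent = [
    [471, 118, 141],
    [29, 79, 29],
    [316, 301, 2198],
]
stats = contingency_stats(table_vi_a_dependent)
rounded = {name: round(value, 2) for name, value in stats.items()}
print(rounded)
```

Note that the printed tables list forecast category 3 in the top row, whereas the sketch takes rows in ascending category order.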
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TABLE VII. LINEAR-REGRESSION EQUATION FOR THE PREDICTED 
VALUE OF THE VISIBILITY CATEGORY (ŷ), ŷ
STATISTICS WITH RESPECT TO THE ACTUAL VISI-
BILITY CATEGORIES (y) AND THRESHOLD VALUES
FROM THE EQUAL-VARIANCE ASSUMPTION MODEL,
NORTH PACIFIC OCEAN, JULY 1979. NOTATION
IS AS IN APPENDIX B.



ŷ = 3.78586 + .04118(EHF) - .91412(FTER) - .01592(RH)

Class conditional distributions (i.e., distribution of ŷ for
a given y).

      Number of       Frequency    Mean value     Standard
 y    observations    of y (p)     of ŷ (m)       deviation of ŷ (σ)
      of y
 1       816            .222       2.077 (m1)        .348
 2       498            .135       2.263 (m2)        .382
 3      2368            .643       2.568 (m3)        .353

T1 = threshold between y = 1 and y = 2 = 2.506
T2 = threshold between y = 2 and y = 3 = 1.768
T3 = threshold between y = 1 and y = 3 = 2.048



State conditional distributions for visibility category I 
(y = 1) , II (y = 2) and III (y = 3) depicting threshold 
values and means. 
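
The three thresholds in Table VII can be approximated from the class statistics alone. The sketch below is a hedged illustration, not the derivation of Appendix B: it assumes the equal-variance model is the usual two-class Gaussian boundary with a common variance and unequal prior frequencies, and it assumes the common variance is the count-weighted average of the three class variances. Both assumptions are mine; they reproduce the tabulated thresholds to within about 0.003. The function names are hypothetical.

```python
import math


def pooled_variance(counts, sigmas):
    """Count-weighted average of the class variances (an assumption)."""
    return sum(n * s * s for n, s in zip(counts, sigmas)) / sum(counts)


def equal_variance_threshold(m_a, m_b, p_a, p_b, variance):
    """Value of y-hat where two equal-variance Gaussian classes with
    means m_a < m_b and prior frequencies p_a, p_b are equally likely."""
    return 0.5 * (m_a + m_b) + variance / (m_b - m_a) * math.log(p_a / p_b)


# Class statistics from Table VII (North Pacific Ocean, July 1979).
counts = [816, 498, 2368]
freq = [0.222, 0.135, 0.643]
mean = [2.077, 2.263, 2.568]
sigma = [0.348, 0.382, 0.353]

variance = pooled_variance(counts, sigma)
t1 = equal_variance_threshold(mean[0], mean[1], freq[0], freq[1], variance)  # y = 1 vs 2
t2 = equal_variance_threshold(mean[1], mean[2], freq[1], freq[2], variance)  # y = 2 vs 3
t3 = equal_variance_threshold(mean[0], mean[2], freq[0], freq[2], variance)  # y = 1 vs 3
print(round(t1, 3), round(t2, 3), round(t3, 3))
```

Because category 2 is rare, the 1/2 boundary falls above the 2/3 boundary, so only the 1/3 threshold separates any forecasts.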
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Density 



TABLE VII (CONT.) 
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TABLE VIII. 



CONTINGENCY TABLES AND RELATED STATISTICS 
FROM LINEAR REGRESSION, FOR BOTH DEPENDENT 
(3682 OBSERVATIONS) AND INDEPENDENT (1841 
OBSERVATIONS) NORTH PACIFIC OCEAN, JULY 
1979, DATA 



DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3     389    342   2131
           2       0      0      0
           1     427    156    237

AO   = .69    AAO   = .14
A1   = [illegible]
TS1  = .35    ATS1  = .17
TS2  = 0.0    ATS2  = [illegible]
TS12 = .28    ATS12 = -.13

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3     189    176   1076
           2       0      0      0
           1     199     70    131

AO   = .69    AAO   = .11
A1   = .13
TS1  = .34    ATS1  = .16
TS2  = 0.0    ATS2  = -.15
TS12 = .26    ATS12 = -.13
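
The empty forecast-2 row in Table VIII is what the Table VII class statistics would lead one to expect. The sketch below is an illustration under stated assumptions, not the thesis's procedure: it assumes Table VIII was produced from the Table VII regression, that each class is Gaussian in ŷ with a common variance, and that the forecast is the class with the largest prior-weighted likelihood. The variance value is a count-weighted pooled figure that I computed from Table VII; the function name is hypothetical.

```python
import math

# Class frequencies and means of y-hat from Table VII.
PRIORS = {1: 0.222, 2: 0.135, 3: 0.643}
MEANS = {1: 2.077, 2: 2.263, 3: 2.568}
VARIANCE = 0.1267  # assumed pooled variance (count-weighted), not quoted from the thesis


def classify(y_hat):
    """Return the category with the largest log(prior) + Gaussian log-likelihood."""
    def score(category):
        deviation = y_hat - MEANS[category]
        return math.log(PRIORS[category]) - deviation ** 2 / (2.0 * VARIANCE)
    return max(PRIORS, key=score)


# Scan a wide range of regression outputs and record which categories occur.
grid = [i / 100 for i in range(0, 501)]
chosen = {classify(y) for y in grid}
print(sorted(chosen))
```

Under these assumptions category 2 is never selected, and the switch from category 1 to category 3 occurs near ŷ = 2.05, in line with the 1/3 threshold listed in Table VII.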
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TABLE IX. THE INITIAL FIVE BEST PREDICTORS FOR EPI'S 
OF FOUR THROUGH TEN, FOR EACH STRATEGY, 
WITH ASSOCIATED PP, a0, a1 AND CE VALUES
FROM THE NORTH ATLANTIC OCEAN AREA 3W 
DEPENDENT DATA, 15 MAY-15 JULY 1983, 
WITHOUT LINEAR-REGRESSION EQUATIONS AS 
PREDICTORS 



                        Maximum-probability      Natural-regression
EPI  Predictor    PP     a0     a1     CE         a0     a1     CE
 4   E850        .372   .697   .125   .482       .514   .446   .526
     SHF         .376   .691   .125   .493       .512   .455   .521
     DTDP        .344   .685   .125   .505       .611   .304   .474
     E925        .359   .685   .125   .505       .505   .453   .537
     SMF         .334   .682   .125   .511       .606   .301   .487
 5   E925        .367   .702   .125   .472       .564   .379   .494
     E850        .375   .700   .125   .475       .576   .370   .478
     DTDP        .344   .699   .125   .477       .528   .409   .535
     SHF         .379   .698   .125   .479       .567   .383   .483
     SMF         .337   .686   .125   .503       .526   .409   .539
 6   DTDP        .353   .710   .125   .456       .568   .360   .503
     E850        .374   .699   .125   .477       .609   .324   .458
     SMF         .341   .699   .125   .477       .563   .360   .514
     E925        .363   .695   .125   .485       .595   .334   .476
     SHF         .374   .693   .125   .489       .512   .455   .521
 7   DTDP        .356   .716   .125   .443       .514   .429   .542
     SMF         .348   .706   .125   .463       .590   .325   .495
     E850        .379   .699   .125   .477       .561   .389   .489
     E925        .364   .692   .125   .491       .547   .400   .506
     SHF         .376   .691   .125   .493       .548   .407   .497
 8   SMF         .352   .714   .125   .448       .543   .386   .528
     DTDP        .356   .712   .125   .451       .611   .304   .474
     E850        .378   .700   .125   .475       .588   .355   .469
     SHF         .379   .691   .125   .493       .512   .455   .521
     E925        .364   .685   .125   .505       .577   .360   .486
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TABLE IX (CONT.) 



                        Maximum-probability      Natural-regression
EPI  Predictor    PP     a0     a1     CE         a0     a1     CE
 9   SMF         .352   .714   .125   .448       .563   .360   .514
     DTDP        .351   .708   .125   .459       .568   .360   .504
     SHF         .382   .700   .125   .475       .541   .417   .501
     E850        .376   .699   .125   .477       .550   .402   .498
     E925        .369   .699   .125   .477       .537   .414   .512
10   SMF         .357   .719   .125   .437       .526   .409   .539
     DTDP        .354   .710   .125   .455       .581   .341   .497
     E925        .369   .702   .125   .471       .564   .379   .493
     E850        .380   .700   .125   .475       .576   .370   .478
     SHF         .381   .698   .125   .479       .567   .383   .483
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TABLE X 



FIRST-STAGE CONTINGENCY TABLE STATISTICS AO,
TS1, AAO AND ATS1 FOR BOTH DEPENDENT AND
INDEPENDENT NORTH ATLANTIC OCEAN AREA 3W,
15 MAY-15 JULY 1983, DATA, FOR EPI'S OF FOUR
THROUGH TEN AND THE MAXIMUM-PROBABILITY
STRATEGY, WITHOUT LINEAR-REGRESSION EQUATIONS
AS PREDICTORS

     Best              Dependent                   Independent
EPI  Predictor    AO    TS1   AAO   ATS1      AO    TS1   AAO   ATS1
 4   E850        .70   .32   .05   .15       .69   .30  -.01   .14
 5   E925        .70   .30   .06   .13       .71   .30   .05   .14
 6   DTDP        .71   .32   .09   .15       .71   .29   .05   .13
 7   DTDP        .72   .31   .11   .14       .71   .28   .07   .11
 8   SMF         .71   .28   .10   .10       .73   .29   .13   .13
 9   SMF         .71   .26   .10   .08       .73   .26   .11   .09
10   SMF         .71   .26   .09   .08       .73   .24   .15   .06
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TABLE XI. SAME AS TABLE IX, EXCEPT WITH LINEAR- 
REGRESSION EQUATIONS AS PREDICTORS 



                        Maximum-probability      Natural-regression
EPI  Predictor    PP     a0     a1     CE         a0     a1     CE
 4   BM1         .443   .753   .125   .370       .662   .282   .394
     BM3         .427   .742   .125   .392       .665   .270   .400
     BM2         .395   .713   .125   .450       .516   .455   .512
     BM7         .389   .705   .125   .465       .512   .461   .515
     E850        .372   .697   .125   .482       .514   .446   .526
 5   BM1         .438   .749   .125   .377       .589   .380   .442
     BM3         .433   .749   .125   .377       .590   .374   .446
     BM2         .400   .727   .125   .421       .566   .387   .482
     BM7         .396   .716   .125   .444       .564   .393   .480
     E925        .367   .702   .125   .472       .564   .379   .494
 6   BM1         .449   .752   .125   .372       .628   .332   .413
     BM3         .433   .746   .125   .383       .625   .328   .422
     BM7         .404   .725   .125   .425       .604   .338   .453
     BM2         .399   .723   .125   .429       .517   .454   .512
     DTDP        .353   .710   .125   .456       .568   .360   .503
 7   BM1         .452   .745   .125   .385       .650   .303   .397
     BM3         .434   .740   .125   .394       .575   .393   .457
     BM2         .406   .728   .125   .419       .554   .406   .486
     BM7         .404   .721   .125   .434       .480   .505   .536
     DTDP        .356   .716   .125   .443       .514   .429   .542
 8   BM1         .453   .753   .125   .370       .606   .358   .431
     BM3         .441   .742   .125   .392       .601   .358   .440
     BM2         .405   .724   .125   .427       .585   .364   .466
     BM7         .406   .723   .125   .429       .575   .378   .472
     SMF         .352   .714   .125   .448       .543   .386   .528
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TABLE XI 



(CONT.)

                        Maximum-probability      Natural-regression
EPI  Predictor    PP     a0     a1     CE         a0     a1     CE
 9   BM1         .453   .752   .125   .372       .689   .250   .372
     BM3         .442   .744   .125   .387       .685   .248   .381
     BM7         .410   .723   .125   .430       .540   .427   .493
     BM2         .405   .721   .125   .434       .547   .414   .491
     SMF         .352   .714   .125   .448       .563   .360   .514
10   BM1         .456   .749   .125   .377       .704   .235   .356
     BM3         .444   .749   .125   .377       .647   .301   .404
     BM2         .411   .727   .125   .421       .576   .377   .471
     BM7         .407   .721   .125   .433       .564   .393   .480
     SMF         .357   .719   .125   .438       .526   .409   .539



TABLE XII. SAME AS TABLE X, EXCEPT WITH LINEAR-
REGRESSION EQUATIONS AS PREDICTORS AND
BM1 IS THE PREDICTOR FOR EACH NUMBER
OF EPI'S

              Dependent                      Independent
EPI     AO    TS1   AAO   ATS1        AO     TS1   AAO     ATS1
 4     .75   .45   .22   .32         .74    .43   .17     .30
 5     .75   .42   .21   .28         .75    .41   .17     .28
 6     .75   .41   .22   .27         .75    .40   [illegible]  .26
 7     .75   .37   .20   .22         [illegible]  .39   .19     .25
 8     .75   .45   .22   .32         .74    .43   .17     .30
 9     .75   .44   .22   .31         .75    .42   [illegible]  .29
10     .75   .42   .21   .28         .75    .41   .17     .28
[TABLE XIII: printed sideways in the original; caption and values not recoverable from the scan.]
Critical level statistics a0(96) and a1(05) could not be
computed due to a 60-minute limit of central processing unit
(CPU) time imposed by the NPS W.R. Church Computer Center.



TABLE XIV. FD(96), FD, RSS FD AND a0 FOR STRATEGY
MAXPROB2, NORTH ATLANTIC OCEAN AREA 3W, 15
MAY-15 JULY 1983, DEPENDENT DATA, WITHOUT
LINEAR-REGRESSION EQUATIONS AS PREDICTORS,
FOR THOSE PREDICTORS SELECTED AT EACH STAGE
OF THE DEVELOPMENTAL MODEL USING FIVE EPI'S.
FD(96) IS COMPUTED FROM 100 RANDOMLY GENERATED
DATA SETS, AS EXPLAINED IN APPENDIX A, AND
PROVIDES A MEASURE OF HOW MUCH ADDITIONAL
PREDICTABILITY MAY BE EXPECTED FROM THE
INCLUSION OF A NEW PREDICTOR. IDEALLY, RSS
FD SHOULD BE LESS THAN FD(96).

Predictor              FD_i of predictor added, i
Added      FD(96)    E925    U700    DVDP    STRTFQ  ENTRN    RSS FD    a0
E925         -         -       -       -       -       -        -      .702
U700       .1518     .1510     -       -       -       -      .1510    .706
DVDP       .2147     .1581   .1494     -       -       -      .2175    .733
STRTFQ     .2629     .1557   .1904   .1427     -       -      .2844    .813
ENTRN      .3036     .1665   .1556   .1734   .1387     -      .3178    .918
PS         .3394     .1897   .1779   .1492   .1971   .1495    .3887    .950
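
The RSS FD column of Table XIV appears to be the root-sum-square of the individual FD values in each row. The sketch below checks that reading; it is an interpretation that reproduces the printed column to about three decimals, not a definition quoted from Appendix A. The dictionary layout and function name are hypothetical.

```python
import math


def root_sum_square(values):
    """Square root of the sum of squares of the FD values in one row."""
    return math.sqrt(sum(v * v for v in values))


# predictor added: (FD(96), FD values against the predictors already selected)
table_xiv = {
    "U700": (0.1518, [0.1510]),
    "DVDP": (0.2147, [0.1581, 0.1494]),
    "STRTFQ": (0.2629, [0.1557, 0.1904, 0.1427]),
    "ENTRN": (0.3036, [0.1665, 0.1556, 0.1734, 0.1387]),
    "PS": (0.3394, [0.1897, 0.1779, 0.1492, 0.1971, 0.1495]),
}

rss = {name: root_sum_square(fds) for name, (_, fds) in table_xiv.items()}

# The caption's criterion: ideally RSS FD is less than FD(96).
meets_criterion = [name for name, (fd96, _) in table_xiv.items() if rss[name] < fd96]
print({name: round(value, 4) for name, value in rss.items()}, meets_criterion)
```

With the tabulated values, only the U700 row satisfies the caption's criterion that RSS FD be less than FD(96).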
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1 






TABLE XV. SAME AS TABLE XIII, EXCEPT WITH LINEAR-REGRESSION
EQUATIONS AS PREDICTORS AND FOR FOUR EPI'S

[Table body printed sideways in the original; values not recoverable from the scan.]
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TABLE XVI. SAME AS TABLE XV, EXCEPT FOR EIGHT EPI'S 



[Table body printed sideways in the original; values not recoverable from the scan.]
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TABLE XVII 



CONTINGENCY TABLES AND RELATED STATISTICS FOR
BOTH DEPENDENT (1526 OBSERVATIONS) AND INDE-
PENDENT (762 OBSERVATIONS) NORTH ATLANTIC
OCEAN AREA 3W, 15 MAY-15 JULY 1983, DATA,
WITHOUT LINEAR-REGRESSION EQUATIONS AS
PREDICTORS, FROM STAGE FIVE OF THE DEVELOP-
MENTAL MODEL. PREDICTORS ARE SMF, D850,
RH, UBLW AND ENTRN, EACH DIVIDED INTO EIGHT
EPI'S, FOR (a) MAXPROB1, (b) MAXPROB2 AND
(c) NATURAL-REGRESSION

(a) MAXPROB1

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3       8     11   1039
           2       5    178      0
           1     283      1      1

AO   = .98    AAO   = .95
A1   = .01
TS1  = .95    ATS1  = .94
TS2  = .91    ATS2  = .90
TS12 = .95    ATS12 = .92

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      68     61    452
           2       9     21     38
           1      64     12     37

AO   = .70    AAO   = .04
A1   = .16
TS1  = .34    ATS1  = .19
TS2  = .15    ATS2  = .03
TS12 = .27    ATS12 = -.05
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TABLE XVII (CONT.) 



(b) MAXPROB2

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3       0      0   1021
           2       0    183     10
           1     296      7      9

AO   = .98    AAO   = .95
A1   = .01
TS1  = .95    ATS1  = .94
TS2  = .92    ATS2  = .90
TS12 = .95    ATS12 = .92

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      54     52    408
           2      14     23     57
           1      73     19     62

AO   = .66    AAO   = -.10
A1   = .19
TS1  = .33    ATS1  = .18
TS2  = .14    ATS2  = .02
TS12 = .27    ATS12 = -.05
TABLE XVII (CONT.)

(c) Natural-Regression

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3       0     10   1031
           2      15    179      9
           1     281      1      0

AO   = [illegible]    AAO   = .93
A1   = .02
TS1  = .95    ATS1  = .93
TS2  = [illegible]    ATS2  = .81
TS12 = .93    ATS12 = .90

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      54     56    407
           2      30     28     91
           1      57     10     29

AO   = .65    AAO   = .15
A1   = .25
TS1  = .32    ATS1  = .16
TS2  = .13    ATS2  = .01
TS12 = .24    ATS12 = -.10



TABLE XVIII. SAME AS TABLE XVII, EXCEPT FOR FIVE
EPI'S. PREDICTORS ARE E925, U700, DVDP,
STRTFQ AND ENTRN

(a) MAXPROB1

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      36     49   1027
           2      21    135      4
           1     239      6      9

AO   = .92    AAO   = .74
A1   = .05
TS1  = .77    ATS1  = .71
TS2  = .63    ATS2  = .57
TS12 = .75    ATS12 = .63

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      54     60    460
           2      19     20     27
           1      68     14     40

AO   = .72    AAO   = .09
A1   = .16
TS1  = .35    ATS1  = .20
TS2  = .14    ATS2  = .02
TS12 = .29    ATS12 = -.02



TABLE XVIII (CONT.) 



(b) MAXPROB2

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      11     12    970
           2       2    148     36
           1     283     30     34

AO   = .92    AAO   = .74
A1   = .05
TS1  = .79    ATS1  = .73
TS2  = .65    ATS2  = .60
TS12 = .78    ATS12 = .67

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      43     49    426
           2      12     21     44
           1      86     24     57

AO   = .70    AAO   = .03
A1   = .17
TS1  = .39    ATS1  = .25
TS2  = .14    ATS2  = .02
TS12 = .32    ATS12 = .01
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I 



TABLE XVIII (CONT.) 



(c) Natural-Regression

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3       3     43    986
           2      76    142     54
           1     217      5      0

AO   = .88    AAO   = .63
A1   = .12
TS1  = .72    ATS1  = .65
TS2  = .44    ATS2  = .36
TS12 = .51    ATS12 = .28

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      41     52    424
           2      39     31     75
           1      61     11     28

AO   = .68    AAO   = -.05
A1   = .23
TS1  = .34    ATS1  = .19
TS2  = .15    ATS2  = .03
TS12 = .27    ATS12 = -.05
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TABLE XIX. 



CONTINGENCY TABLES AND RELATED STATISTICS
FOR BOTH DEPENDENT (1526 OBSERVATIONS) AND
INDEPENDENT (762 OBSERVATIONS) NORTH ATLANTIC
OCEAN AREA 3W, 15 MAY-15 JULY 1983, DATA,
WITH LINEAR-REGRESSION EQUATIONS AS PREDICTORS,
FROM STAGE FOUR OF THE DEVELOPMENTAL MODEL.
PREDICTORS ARE BM1, U850, D500 AND V850,
EACH DIVIDED INTO FOUR EPI'S, FOR (a) MAXPROB1,
(b) MAXPROB2 AND (c) NATURAL-REGRESSION

(a) MAXPROB1

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      97    120    990
           2       6     21      5
           1     193     49     45

AO   = .79    AAO   = .34
A1   = .12
TS1  = .50    ATS1  = .37
TS2  = .10    ATS2  = -.02
TS12 = .40    ATS12 = .12

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      45     74    499
           2       4      5      4
           1      92     15     24

AO   = .78    AAO   = .29
A1   = .13
TS1  = .51    ATS1  = .40
TS2  = .05    ATS2  = -.09
TS12 = .37    ATS12 = .09
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TABLE XIX (CONT.) 



(b) MAXPROB2

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      77    109    967
           2       3     21      9
           1     216     60     64

AO   = .79    AAO   = .34
A1   = .12
TS1  = .51    ATS1  = .40
TS2  = .10    ATS2  = -.02
TS12 = .42    ATS12 = .16

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      36     68    481
           2       3      8      6
           1     102     18     40

AO   = .78    AAO   = .27
A1   = .12
TS1  = .51    ATS1  = .40
TS2  = .08    ATS2  = -.05
TS12 = .39    ATS12 = .12
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t 






TABLE XIX (CONT.) 



(c) Natural-Regression

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      35     82    875
           2     131     87    147
           1     130     21     18

AO   = .72    AAO   = .11
A1   = .25
TS1  = .39    ATS1  = .24
TS2  = .19    ATS2  = .07
TS12 = .33    ATS12 = .02

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      24     49    427
           2      53     38     87
           1      64      7     13

AO   = .69    AAO   = .01
A1   = .26
TS1  = .40    ATS1  = .26
TS2  = .16    ATS2  = .05
TS12 = .30    ATS12 = -.01
TABLE XX. SAME AS TABLE XIX, EXCEPT RESULTS ARE FROM
STAGE TWO IN THE DEVELOPMENTAL MODEL AND
PREDICTORS ARE DIVIDED INTO EIGHT EPI'S
EACH. PREDICTORS ARE BM1 AND U500

(a) MAXPROB1

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3     112    130    965
           2      10     13      9
           1     174     47     66

AO   = .75    AAO   = .23
A1   = .13
TS1  = .43    ATS1  = .29
TS2  = .06    ATS2  = -.07
TS12 = .33    ATS12 = .02

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      56     79    484
           2       1      0      3
           1      84     15     40

AO   = .75    AAO   = .17
A1   = .13
TS1  = .43    ATS1  = .30
TS2  = 0.0    ATS2  = [illegible]
TS12 = .30    ATS12 = -.01
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TABLE XX (CONT.) 



(b) MAXPROB2

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      90    118    943
           2       3      6      4
           1     203     66     93

AO   = .75    AAO   = .23
A1   = .13
TS1  = .45    ATS1  = .31
TS2  = .03    ATS2  = -.11
TS12 = .36    ATS12 = .06

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      46     76    470
           2       0      0      2
           1      95     18     55

AO   = .74    AAO   = .16
A1   = .13
TS1  = .44    ATS1  = .32
TS2  = 0.0    ATS2  = -.14
TS12 = .33    ATS12 = .02
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TABLE XX (CONT.) 



(c) Natural-Regression

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      59     97    873
           2     170     76    156
           1      67     17     11

AO   = .67    AAO   = .05
A1   = .29
TS1  = .21    ATS1  = .02
TS2  = .15    ATS2  = .03
TS12 = .22    ATS12 = .15

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      32     64    431
           2      74     25     90
           1      35      5      6

AO   = .64    AAO   = -.15
A1   = .31
TS1  = .23    ATS1  = .06
TS2  = .10    ATS2  = -.03
TS12 = .18    ATS12 = -.18
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TABLE XXI. LINEAR-REGRESSION EQUATIONS FOR THE PREDICTED 
VALUE OF THE VISIBILITY CATEGORY (ŷ), FOR BOTH
REGRESSION METHODS, ŷ STATISTICS WITH RESPECT
TO THE ACTUAL VISIBILITY CATEGORIES (y) AND
THRESHOLD VALUES FROM BOTH THRESHOLD MODELS,
NORTH ATLANTIC OCEAN AREA 3W, 15 MAY-15 JULY
1983. NOTATION IS AS IN APPENDIX B

A. Definitions:

LR1 - Linear regression method 1: single equation,
      three visibility categories

LR2 - Linear regression method 2: decision-tree; two
      equations, two visibility categories each

a - All predictors were made available to the
    regression model

b - Only the best predictors from the Preisendorfer
    (1983 a,b,c) methodology were made available
    to the regression model

A - Quadratic threshold model (Case III, Appendix B)

B - Equal variance threshold model (Case I, Appendix B)

B. LR1a

ŷ = 2.81132 + .16201(EAIR) - .00237(E850) - .07319(T925)
    - .16179(E925)

Class conditional distributions (i.e., the distribution of ŷ
for a given y).

      Number of       Frequency    Mean value     Standard
 y    observations    of y (p)     of ŷ (m)       deviation of ŷ (σ)
      of y
 1       296            .194       2.014 (m1)        .434
 2       190            .125       2.324 (m2)        .379
 3      1040            .682       2.652 (m3)        .352
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Density 



TABLE XXI (CONT.) 



LR1aA

T1 = threshold between y = 1 and y = 2 = 2.275
T2 = threshold between y = 2 and y = 3 = 1.839
T3 = threshold between y = 1 and y = 3 = 2.008

(second threshold value, of the pair, was of no interest.
See Appendix B)

LR1aB

Ta = threshold between y = 1 and y = 2 = 2.368
Tb = threshold between y = 2 and y = 3 = 1.768
Tc = threshold between y = 1 and y = 3 = 2.060



State conditional distributions for visibility category 
I (y = 1) , II (y = 2) and III (y = 3) depicting 
threshold values and means. 
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f 



TABLE XXI (CONT.) 



C. LR2a

Equation 1: ŷ = .90305 + .06122(EAIR) + .11284 x 10^[exponent illegible](D850)
                - .08438(E850) - .04083(T925)

Class conditional distributions

      Number of       Frequency    Mean value     Standard
 y    observations    of y (p)     of ŷ (m)       deviation of ŷ (σ)
      of y
 0       486            .318        .479 (m0)        .222
 1      1040            .682        .776 (m1)        .209

LR2aA: T1 = threshold between y = 0 and y = 1 = .4979
LR2aB: Ta = threshold between y = 0 and y = 1 = .5110



State conditional distributions for combined visibility 
categories I and II (y = 0) and visibility category III 
(y = 1) depicting threshold values and means 
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1 









TABLE XXI (CONT.) 



Equation 2: ŷ = .01229 - .18917 x 10^[exponent illegible](U1000)
                - .02088(T500) + . 1 339 x 10^[exponent illegible](U500)
                + .15259 x 10^[exponent illegible](D925) - .32705 x 10^[exponent illegible](STRTFQ)
                + 7.50153(DEDP) - .03279(DVDP)

Class conditional distributions

      Number of       Frequency    Mean value     Standard
 y    observations    of y (p)     of ŷ (m)       deviation of ŷ (σ)
      of y
 0       296            .609        .319 (m0)        .186
 1       190            .391        .503 (m1)        .194

LR2aA: T1 = threshold between y = 0 and y = 1 = .5102

LR2aB: Ta = threshold between y = 0 and y = 1 = .4972

State conditional distributions for visibility category I 
(y = 0) and II (y = 1) depicting threshold values and means. 
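
The decision-tree method (LR2) chains the two equations: Equation 1 separates category III from the combined categories I and II, and Equation 2 then separates II from I. A minimal sketch of that two-step rule follows. It assumes a regression value at or above the threshold selects the class coded y = 1 in each equation, which is inferred from the class means listed above rather than stated in this section. The function name, argument names, and the treatment of ties are hypothetical; the default thresholds are the LR2aB values.

```python
def decision_tree_category(y1_hat, y2_hat, t_eq1=0.5110, t_eq2=0.4972):
    """Two-step visibility category from the LR2 regression outputs.

    y1_hat: value of Equation 1 (y = 0 for categories I and II, y = 1 for III)
    y2_hat: value of Equation 2 (y = 0 for category I, y = 1 for II)
    Ties are assigned to the higher category, which is an arbitrary choice.
    """
    if y1_hat >= t_eq1:
        return 3          # Equation 1 places the case in category III
    if y2_hat >= t_eq2:
        return 2          # Equation 2 places the case in category II
    return 1


# Illustrative regression values only; they are not taken from the data set.
examples = [(0.78, 0.30), (0.45, 0.52), (0.45, 0.30)]
categories = [decision_tree_category(y1, y2) for y1, y2 in examples]
print(categories)
```

Equation 2 is consulted only when Equation 1 falls below its threshold, so its value has no effect on a category III forecast.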
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TABLE XXI ( CONT . ) 



D. LR2b

Equation 1: ŷ = .89952 - .04830(E850) + .02472(SHF)
                + 2.17081(DTDP) + 6.81684(DEDP)

Class conditional distributions

      Number of       Frequency    Mean value     Standard
 y    observations    of y (p)     of ŷ (m)       deviation of ŷ (σ)
      of y
 0       486            .318        .496 (m0)        .220
 1      1040            .682        .768 (m1)        .201

LR2bA: T1 = threshold between y = 0 and y = 1 = .4922

LR2bB: Ta = threshold between y = 0 and y = 1 = .5119



State conditional distributions for visibility categories 
I and II (y = 0) and visibility category III (y = 1) 
depicting threshold values and means. 
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TABLE XXI (CONT.) 



Equation 2: ŷ = .71769 + .11439 x 10^[exponent illegible](V700)
                - .47810 x 10^[exponent illegible](STRTFQ)
                + 4.5433(DTDP)

Class conditional distributions

      Number of       Frequency    Mean value     Standard
 y    observations    of y (p)     of ŷ (m)       deviation of ŷ (σ)
      of y
 0       296            .609        .337 (m0)        .164
 1       190            .391        .476 (m1)        .177

LR2bA: T1 = threshold between y = 0 and y = 1 = .5208
LR2bB: Ta = threshold between y = 0 and y = 1 = .4978



State conditional distributions for visibility category I 
(y = 0) and II (y = 1) depicting threshold values and means. 
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TABLE XXII. 



CONTINGENCY TABLES AND RELATED STATISTICS 
FROM LINEAR REGRESSION METHOD 1 (SINGLE 
EQUATION) , QUADRATIC THRESHOLD MODEL, FOR 
BOTH DEPENDENT (1526 OBSERVATIONS) AND 
INDEPENDENT (762 OBSERVATIONS) NORTH 
ATLANTIC OCEAN AREA 3W, 15 MAY-15 JULY 1983, 
DATA, WITH ALL PREDICTORS AVAILABLE TO THE 
REGRESSION MODEL 



LR1aA (Table XXI)

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3     152    151    996
           2       0      0      0
           1     144     39     44

AO   = .75    AAO   = .21
A1   = .12
TS1  = .38    ATS1  = .23
TS2  = 0.0    ATS2  = [illegible]
TS12 = .27    ATS12 = -.07

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      69     80    498
           2       0      0      0
           1      72     14     29

AO   = .75    AAO   = .18
A1   = .12
TS1  = .39    ATS1  = .25
TS2  = 0.0    ATS2  = -.14
TS12 = .27    ATS12 = -.05
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TABLE XXIII 



SAME AS TABLE XXII, EXCEPT USING THE 
EQUAL-VARIANCE THRESHOLD MODEL 



LR1aB (Table XXI)

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3     135    147    984
           2       0      0      0
           1     161     43     56

AO   = .75    AAO   = .22
A1   = .12
TS1  = .41    ATS1  = .27
TS2  = 0.0    ATS2  = [illegible]
TS12 = .30    ATS12 = -.0[digit illegible]

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      65     78    492
           2       0      0      0
           1      76     16     35

AO   = .75    AAO   = .17
A1   = .12
TS1  = .40    ATS1  = .26
TS2  = 0.0    ATS2  = -.14
TS12 = .28    ATS12 = -.04
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I 



1 



TABLE XXIV. CONTINGENCY TABLES AND RELATED STATISTICS
FROM LINEAR REGRESSION METHOD 2 (DECISION-
TREE), QUADRATIC THRESHOLD MODEL, FOR BOTH
DEPENDENT (1526 OBSERVATIONS) AND INDEPENDENT
(762 OBSERVATIONS) NORTH ATLANTIC OCEAN AREA
3W, 15 MAY-15 JULY 1983, DATA, WITH ALL
PREDICTORS AVAILABLE TO THE REGRESSION MODEL

LR2aA (Table XXI)

DEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3     105    118    945
           2      11     28     19
           1     180     44     76

AO   = .76    AAO   = .23
A1   = .13
TS1  = .43    ATS1  = .30
TS2  = .13    ATS2  = .00
TS12 = .36    ATS12 = .06

INDEPENDENT DATA

                     OBSERVED
                   1      2      3
FORECAST   3      52     68    474
           2      11      8      6
           1      78     18     47

AO   = .73    AAO   = [illegible]
A1   = .14
TS1  = .38    ATS1  = .24
TS2  = .07    ATS2  = -.06
TS12 = .30    ATS12 = -.01
TABLE XXV

SAME AS TABLE XXIV, EXCEPT USING THE
EQUAL-VARIANCE THRESHOLD MODEL

LR2aB (Table XXI)

DEPENDENT DATA

                       OBSERVED
                    1      2      3
FORECAST    3      96    116    938
            2      15     30     26
            1     185     44     76

AO   = .76     AAO   = .23
A1   = .13
TS1  = .44     ATS1  = .31
TS2  = .13     ATS2  = .01
TS12 = .37     ATS12 = .07

INDEPENDENT DATA

                       OBSERVED
                    1      2      3
FORECAST    3      49     67    464
            2      12      9     13
            1      80     18     50

AO   = .73     AAO   = .11
A1   = .14
TS1  = .38     ATS1  = .24
TS2  = .08     ATS2  = -.05
TS12 = .30     ATS12 = -.01
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TABLE XXVI 



CONTINGENCY TABLES AND RELATED STATISTICS FROM
LINEAR REGRESSION METHOD 2 (DECISION-TREE),
QUADRATIC THRESHOLD MODEL, FOR BOTH DEPENDENT
(1526 OBSERVATIONS) AND INDEPENDENT (762
OBSERVATIONS) NORTH ATLANTIC OCEAN AREA 3W,
15 MAY-15 JULY 1983, DATA, WITH ONLY THOSE
PREDICTORS IDENTIFIED AS BEST BY THE
PREISENDORFER METHODOLOGY AVAILABLE TO THE
REGRESSION MODEL

LR2bA (Table XXI)

DEPENDENT DATA

                       OBSERVED
                    1      2      3
FORECAST    3     116    127    952
            2       5     10     13
            1     175     53     75

AO   = .75     AAO   = .20
A1   = .13
TS1  = .41     ATS1  = .27
TS2  = .05     ATS2  = [illegible]
TS12 = .32     ATS12 = [illegible]

INDEPENDENT DATA

                       OBSERVED
                    1      2      3
FORECAST    3      54     72    475
            2       4      1      7
            1      83     21     45

AO   = .73     AAO   = .14
A1   = .14
TS1  = .40     ATS1  = .26
TS2  = .01     ATS2  = -.13
TS12 = .29     ATS12 = -.02
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TABLE XXVII 



SAME AS TABLE XXVI, EXCEPT USING THE
EQUAL-VARIANCE THRESHOLD MODEL

LR2bB (Table XXI)

DEPENDENT DATA

                       OBSERVED
                    1      2      3
FORECAST    3     105    116    933
            2       8     14     23
            1     183     60     84

AO   = .74     AAO   = .19
A1   = .14
TS1  = .42     ATS1  = [illegible]
TS2  = .06     ATS2  = -.07
TS12 = .33     ATS12 = .02

INDEPENDENT DATA

                       OBSERVED
                    1      2      3
FORECAST    3      51     71    465
            2       5      3     10
            1      85     20     52

AO   = .73     AAO   = .11
A1   = .14
TS1  = .40     ATS1  = .26
TS2  = .03     ATS2  = -.11
TS12 = .30     ATS12 = -.02
TABLE XXVIII. SUMMARY OF THE CONTINGENCY TABLE STATISTICS FOR ALL MOS
VARIATIONS USED IN THE NORTH ATLANTIC OCEAN AREA 3W,
15 MAY-15 JULY 1983

[Table body printed sideways in the original; entries not recoverable from the scan]
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FIGURES 
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FIG. 1. Homogeneous Areas for the North Atlantic Ocean 
June and July, from Lowe (1984b) 



[Plot: AO and threat scores (0.0 to 0.8) versus number of
equally populous intervals]

Fig. 2a. The behavior of contingency table statistics
for dependent (AO, dashes; TS1, solid) and
independent (AO, chaindots; TS1, chaindashes)
data, as the number of EPI's is varied, for
the North Atlantic Ocean area 3W, 15 May-15
July 1983, when predictors are chosen based upon
the maximum increase of sq in the dependent
data, for (a) a single predictor (SMF), (b) two
predictors, (c) three predictors, (d) four
predictors, and (e) five predictors. Numbers
in parentheses represent the number of EPI's
that was fixed for the indicated parameter so
that the number of EPI's for the next predictor
could be varied.
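As an illustration of what dividing a predictor into equally populous intervals (EPI's) involves, the sketch below cuts a sample at its empirical quantiles so that each interval holds about the same number of observations. This is only one plausible reading of the term. The function names `epi_edges` and `epi_index` and the sample values are hypothetical, and the report's own binning procedure may differ, for example in how ties are handled.

```python
import bisect
from collections import Counter


def epi_edges(values, num_epis):
    """Interior cut points that split `values` into num_epis
    intervals of (nearly) equal population."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // num_epis] for k in range(1, num_epis)]


def epi_index(value, edges):
    """Zero-based interval number for `value`; a value equal to a
    cut point is placed in the upper interval."""
    return bisect.bisect_right(edges, value)


# Hypothetical predictor sample (12 values, no ties).
sample = [7.0, 1.5, 3.2, 9.9, 4.4, 0.3, 6.1, 8.8, 2.7, 5.0, 10.4, 11.6]
edges = epi_edges(sample, 4)
counts = Counter(epi_index(v, edges) for v in sample)
print(edges, sorted(counts.items()))
```

With no tied values and a sample size divisible by the number of intervals, every interval receives the same count. Repeated values at a cut point would make the populations unequal.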

Fig. 2b. Same as Fig. 2a, except for two predictors (SMF(6)
and DTDP).

Fig. 2c. Same as Fig. 2a, except for three predictors
(SMF(6), DTDP(16) and PS).

Fig. 2d. Same as Fig. 2a, except for four predictors
(SMF(6), DTDP(16), PS(12) and UBLW).

Fig. 2e. Same as Fig. 2a, except for five predictors
(SMF(6), DTDP(16), PS(12), UBLW(10) and V400).

Fig. 3a. Same as Fig. 2a, except predictors, after the
first, are selected by having the lowest RSS FD for
(a) two predictors (SMF(6) and RH), (b) three
predictors, (c) four predictors, (d) five
predictors, and (e) six predictors.

Fig. 3b. Same as Fig. 3a, except for three predictors
(SMF(6), RH(3) and DUDP).

Fig. 3d. Same as Fig. 3a, except for five predictors
(SMF(6), RH(3), DUDP(4), VOR925(2) and ENTRN).

Fig. 4. The behavior of functional dependence (FD) as
determined from 100 randomly generated data sets
(Preisendorfer, 1983c) for EPI's of two through ten
for (a) the North Atlantic Ocean area 3W, 15 May-
15 July 1983, dependent data (1526 observations)
and (b) the North Pacific Ocean, July 1979, dependent
data (3682 observations). Plotted are FD(96)
(upper dashed), FD(05) (lower dashed), mean FD
(solid) and standard deviation (x 100) (chaindashes).

Fig. 5. First stage contingency table statistics AAO,
dependent data (solid), and ATS1, independent data
(dashed), North Pacific Ocean, July 1979, as a
function of the number of EPI's, from the
Preisendorfer (1983a,b) methodology. EHF is the
predictor for all EPI's.

Fig. 6. Contingency table statistics AAO and ATS1 for both
dependent and independent North Pacific Ocean, July
1979, data as a function of the number of predictors
in the model for strategies (a) MAXPROB1 and (b)
MAXPROB2. Predictors are EHF, DDWW, H510, THF and
CLIMO, each divided into five EPI's. Negative
values are not plotted.

Fig. 7. Same as Fig. 5, except for the North Atlantic
Ocean area 3W, 15 May-15 July 1983. BM1 is the
predictor for all EPI's.

Fig. 8. Behavior of a0(96) (upper solid), a0(05) (lower
solid), a1(96) (upper dashed), a1(05) (lower dashed),
PP(96) (upper dotted) and PP(05) (lower dotted) from
100 randomly generated data sets, using predictors
from the North Atlantic Ocean area 3W experiment,
with each predictor divided into four EPI's, for (a)
as each predictor is added and (b) as the forecast
array size increases (forecast array size, at any
given stage, is equal to the number of EPI's taken
to the nth power, where n is equal to the number of
predictors included at that stage).
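The forecast array size defined in the Fig. 8 caption (the number of EPI's raised to the power n, with n the number of predictors included) grows quickly. The sketch below tabulates it for four and eight EPI's and, purely as an illustration, divides the 1526 dependent observations by the array size to give a mean count per cell. The function name `forecast_array_size` and the per-cell average are assumptions added here, not quantities computed in the report.

```python
def forecast_array_size(num_epis, num_predictors):
    """Number of cells in the forecast array: EPI's ** predictors."""
    return num_epis ** num_predictors


NUM_OBS = 1526  # dependent-data sample size, area 3W

for num_epis in (4, 8):
    for n in range(1, 6):
        size = forecast_array_size(num_epis, n)
        # Mean observations per cell if the data were spread evenly.
        print(num_epis, n, size, round(NUM_OBS / size, 2))
```

With eight EPI's the array already has more cells than there are dependent observations once four predictors are included, so many cells must be empty at that stage.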

Fig. 9. Same as Fig. 8, except each predictor is divided
into eight EPI's.

Fig. 10. Contingency table statistics AAO and ATS1 for both
dependent and independent North Atlantic Ocean area
3W, 15 May-15 July 1983, data, without linear
regression equations as predictors, as a function
of the number of predictors in the model for
strategies (a) MAXPROB1 and (b) MAXPROB2.
Predictors are SMF, D850, RH, UBLW and ENTRN, each
divided into eight EPI's. Negative values are not
plotted.

Fig. 11. Same as Fig. 10, except predictors are E925, U700,
DVDP, STRTFQ, ENTRN and PS, each divided into five
EPI's.

Fig. 12. Contingency table statistics AAO and ATS1 for both
dependent and independent North Atlantic Ocean area
3W, 15 May-15 July 1983, data, with linear regression
equations as predictors, as a function of the number
of predictors in the model for strategies (a)
MAXPROB1 and (b) MAXPROB2. Predictors are BM1, U850,
D500, V850, D1000 and U1000, each divided into four
EPI's.

Fig. 13. Same as Fig. 12, except predictors are BM1, U500,
ENTRN, DVDP and BM4, each divided into eight EPI's.

Fig. 16. Conditional probabilities of VISCAT's as a
function of EPI's for EHF.

Fig. 17. Sample calculation of the average visibility
category (VISCAT), natural-regression strategy,
for the first EPI (i = 1) of predictor EHF.

Fig. 19. Skill diagram with lines of constant a1 + 2a2.

Bivariate plot (II = lGP2(ii-1)+jj)

Fig. 22b. Reduction of the four-dimensional problem, in
Fig. 22a, to two dimensions.
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